A NOVEL TUMOR MARKER AND NOVEL METHOD OF ISOLATING SAME

FIELD OF THE INVENTION
This invention relates to proteins that serve as tumor
markers for human carcinoma and to methods of isolating
differentially expressed genes.

GOVERNMENT RIGHTS

This invention was made in part with U.S. Government
support. Therefore, the U.S. Government has certain rights 
in the invention. 

BACKGROUND OF THE INVENTION 
Tumor markers for human tumor cells have been largely 
limited to activated oncogenes and their products, for 
example, the myc, ras, fos, and erbB2 genes and their encoded 
oncoproteins. In addition, activated anti-oncogenes, such
as RB, p53, and DCC, have been identified in normal cells but 
do not appear to be present in tumor cells. Oncogene and 
anti-oncogene products have proven difficult to use as 
consistent predictors of tumor and normal tissue, 
respectively, due to the relatively low level of expression 
of the genes encoding these proteins. Thus, there is a need 
in the art for a tumor marker which is not only 
differentially expressed in tumor and normal tissue, but also 
consistently detectable in human tumor tissue and 
consistently absent in the corresponding normal tissue. 

A common method used to identify genes differentially
or uniquely expressed in tumors, in cells responding to
growth factors, and in differentiated cell types such as,
among others, T cells, adipocytes, neurons, and hepatocytes
is the subtractive hybridization technique (S.W. Lee et al.,
Proc. Natl. Acad. Sci. USA 80:4699, 1983). A method of
differential display of eukaryotic mRNA by means of the
polymerase chain reaction (PCR) has recently been developed
(P. Liang et al., Science 257:967, 1992). This method
utilizes oligo dT linked to two additional bases as the
primer for reverse transcription driven by reverse
transcriptase. cDNA fragments are then amplified by Taq DNA
polymerase-based PCR using an oligo dT primer along with one
additional primer. The amplified cDNAs are then resolved by
DNA sequencing gels. There is a need in the art for a 
streamlined and simplified process for isolating cDNAs 
corresponding to differentially expressed mRNAs. 

SUMMARY OF THE INVENTION
The invention is based on the discovery of a novel
protein, TCI (SEQ ID NO:4), which is a tumor marker,
particularly for invasive and metastatic tumors, and the gene
encoding this protein.

The invention thus encompasses the TCI protein (SEQ ID 
NO:4), which is useful as a tumor marker for tumor diagnosis 
and therapy, particularly for colorectal, breast, and 
gastrointestinal tumors, and for metastatic tumors emanating 
from these tumor types. TCI is also a useful marker in
general for tumor cell invasion and metastasis. mRNA
encoding TCI is not expressed in most cultured tumor cells,
i.e., in vitro, but is expressed once these cells are grown
in vivo. Because later stage and deeply invasive tumors
contain higher levels of TCI protein than other tumor
tissues, TCI appears to be a particularly useful marker for
later stage cancers.

TCI protein may also serve as a target in tumor targeted 
therapy to prevent tumor cell metastasis and thus invasion 
of additional organs. For example, a polypeptide fragment 
of the TCI protein may be used as an antagonist of TCI 
biological activity; e.g., where TCI biological activity 
includes invasion and metastasis, the polypeptide fragment 
may be administered to a patient afflicted with the tumor in 
order to inhibit the spread of the tumor to other tissues. 
Alternatively, a truncated portion of TCI which retains the
invasive and metastatic biological activities of the
full-length molecule will be useful for screening for
antagonists of TCI activity. Potentially useful polypeptides
are described herein.

WO 95/11923    PCT/US94/12502

The invention also encompasses nucleotide probes based
on the TCI nucleotide sequence; e.g., 10, 20, 30, 40, etc.
nucleotides in length. Such probes are useful for PCR-based
tumor detection and in situ hybridization of tumor tissue
sections. In addition, probes whose nucleotide sequences are
based on homologies with other genes or proteins having
sequences related to TCI, i.e., genes of the TCI family, two
of which are described herein, are useful for detecting
additional genes belonging to the TCI family of genes.

The invention thus also encompasses methods of screening
for agents which inhibit expression of the TCI gene (SEQ ID
NO:3) in vitro, comprising exposing a metastatic cell line in
which TCI mRNA is detectable in cultured cells to an agent
suspected of inhibiting production of the TCI mRNA; and
determining the level of TCI mRNA in the exposed cell line,
wherein a decrease in the level of TCI mRNA after exposure
of the cell line to the agent is indicative of inhibition of
TCI mRNA production.

Alternatively, the screening method may include exposing,
in vitro, a metastatic cell line in which TCI protein is
detectable in cultured cells to an agent suspected of
inhibiting production of the TCI protein; and determining the
level of TCI protein in the cell line, wherein a decrease in
the level of TCI protein after exposure of the cell line to
the agent is indicative of inhibition of TCI protein
production.

The invention also encompasses in vivo methods of
screening for agents which inhibit expression of the TCI
gene, comprising exposing a mammal having tumor cells in
which TCI mRNA or protein is detectable to an agent suspected
of inhibiting production of TCI mRNA or protein; and
determining the level of TCI mRNA or protein in tumor cells
of the exposed mammal. A decrease in the level of TCI mRNA or
protein after exposure of the mammal to the agent is
indicative of inhibition of TCI gene expression.

These screening methods are particularly applicable to 
breast tumor cells, colon tumor cells, or tumor cells of the 
gastrointestinal tract.

The invention also encompasses a pharmaceutical 
composition for use in treating a late stage cancer, 
comprising an effective amount of an inhibitor of TCI, and 
a method of treating late stage cancer, comprising 
administering to a mammal afflicted with a late stage cancer 
a therapeutically effective amount of an inhibitor of TCI. 
Late stage cancers include those which have become deeply 
invasive in a tissue or which have metastasized to other 
tissues.

TCI is detectable in patient blood, urine, sputum or 
other body fluid using a monoclonal antibody specific for a 
TCI epitope. Thus, the invention also encompasses antibodies 
specific for TCI, which can easily be prepared in a kit form. 
Monoclonal antibodies specific for TCI may be used for tumor 
imaging to localize tumor position and size. TCI-specific
monoclonal antibodies are also useful as screening and
diagnostic agents in immunohistochemical staining of tissue
sections to distinguish tumor cells from normal cells. Thus,
anti-TCI antibodies are particularly useful where they
recognize cells which produce the TCI protein when such cells
are paraffin-embedded and/or formalin-fixed. One example of
such an antibody is the monoclonal antibody anti-TCI-1
produced by the hybridoma deposited with the American Type 
Culture Collection as ATCC Deposit No. HB 11481. 

In another aspect, the invention also features a novel 
method, called palindromic PCR, for identifying and isolating 
a gene, e.g., a gene which is differentially expressed in 
different types of tissues. The method is based on the use
of short DNA primers and corresponding palindromic nucleotide
sequences in the nucleotide sequence to be isolated.

Thus, the invention encompasses a method for producing 
a double stranded cDNA that includes the steps of contacting 
an mRNA with a DNA primer under stringent hybridization 
conditions to form a first hybrid molecule, the primer having 
a length of from 8 to 12 nucleotides and, preferably, 9 to 
11 nucleotides; subjecting the first hybrid molecule to an 
enzyme having reverse transcriptase activity, to produce a 
first DNA strand complementary to at least a portion of the 
mRNA; contacting the first DNA strand with the primer under 
stringent hybridization conditions to form a second hybrid 
molecule; and subjecting the second hybrid molecule to an 
enzyme having DNA polymerase activity, to produce a second 
DNA strand complementary to the first DNA strand. 
Preferably, the method also includes the step of amplifying 
the first and second DNA strands. 

In preferred embodiments, a single enzyme provides both 
the reverse transcriptase activity and the DNA polymerase 
activity. One example of a suitable such enzyme is rTth DNA 
polymerase from the thermophilic eubacterium Thermus
thermophilus.

As used herein, the term "palindromic nucleotide 
sequences" means that a double stranded DNA molecule contains 
a specific DNA sequence in both its coding strand and its 
anti-parallel strand, when those strands are read in the same 
direction, e.g., 5' to 3'.
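
As an illustration only, this definition can be expressed in a few lines of Python. The function names and the toy strand below are hypothetical and are not part of the original analysis; the sketch simply lists every k-base sequence that occurs both in a strand and in its anti-parallel (reverse-complement) strand when both are read 5' to 3'.

```python
def reverse_complement(seq):
    """Return the anti-parallel strand, written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def kmers(seq, k):
    """Set of all k-base windows in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def palindromic_sequences(strand, k):
    """k-base sequences found in both the strand and its anti-parallel strand."""
    strand = strand.upper()
    return kmers(strand, k) & kmers(reverse_complement(strand), k)


# Hypothetical toy strand: CTGATC sits at the 5' end and its reverse
# complement, GATCAG, sits further downstream.
toy = "CTGATCAAAAAGATCAG"
print(sorted(palindromic_sequences(toy, 6)))
```

Note that a sequence which is its own reverse complement also satisfies the definition, so such sequences appear in the result as well.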

The specific sequence of the DNA primer is arbitrary in 
that it is based upon individual judgment. In some 
instances, the sequence can be entirely random or partly 
random for one or more bases. Preferably, the GC content of 
the primer is between 40% and 60%, most preferably about 50%. 
In other instances, the arbitrary sequence can be selected 
to contain a specific ratio of each deoxynucleotide. The 
arbitrary sequence can also be selected to contain, or not
to contain, a recognition site for a specific restriction
endonuclease.

The DNA primer can contain a sequence that is known to
be a "consensus sequence" of an mRNA of known sequence. As
defined herein, a "consensus sequence" is a sequence that has
been found in a gene family of proteins having a similar
function or similar properties. The use of a primer that
includes a consensus sequence may result in the cloning of
additional members of a desired gene family.

Palindromic PCR enables genes that are altered in their 
frequency of expression, as well as those that are 
constitutively or differentially expressed, to be identified 
by simple visual inspection and isolated. The method also 
allows the cloning and sequencing of selected mRNAs, so that 
the investigator may determine the relative desirability of 
the gene product prior to screening a comprehensive cDNA 
library for the full length gene product. 

Further objects and advantages of the invention will be 
apparent in light of the following description and the 
claims. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a polyacrylamide gel of size-separated cDNAs 
that were reverse transcribed from paired mRNAs from colon 
carcinoma (T) and adjacent normal colon tissue (N) and 
subsequently amplified. 

Fig. 2A is a gel in which the TCI cDNA fragment 
identified in Fig. 1 was recovered and re-amplified. 

Fig. 2B is a Northern Blot of three pairs of RNA from 
colon carcinoma (T) and their adjacent normal colon tissue 
(N) probed with 32P-labeled TCI cDNA. 

Fig. 3 shows the nucleotide sequence (described herein
as SEQ ID NO:1) and corresponding amino acid sequence (SEQ
ID NO:2) of the 636 bp partial TCI clone.

Fig. 4 shows the nucleotide sequence (SEQ ID NO:3) and
corresponding amino acid sequence (SEQ ID NO:4) of the
full-length TCI gene and protein.

Fig. 5 is a sequence comparison of the four internal
homologous domains of TCI (SEQ ID NO: 4), each approximately
135 amino acids.

Fig. 6A is a schematic representation of the four
repeats of TCI.

Fig. 6B is a proposed schematic arrangement of the four
repeated domains and the N- and C-terminal domains.

Fig. 7 shows the amino acid sequence identity between 
TCI (SEQ ID NO: 4) and Big-h3 (SEQ ID NO: 17). 

Fig. 8 is a Northern Blot of five pairs of RNA from 
colon carcinoma (T) and adjacent normal colon tissue (N) 
probed with 32P-labeled Big-h3 cDNA, the bottom panel 
representing control RNA probed with 32P-labeled β-actin.

Fig. 9 shows amino acid sequence homology between TCI
(SEQ ID NO:4) and Fasciclin I from Grasshopper (GrF) (SEQ ID
NO: 18) and Drosophila (DrF) (SEQ ID NO: 19). 

Fig. 10 is a schematic representation showing that, on
average, in every 202 bases of sequence in one strand of
cDNA, there is one 9-base sequence exactly palindromic to 
that in a region of its antiparallel strand. 

Fig. 11 shows the relationship between the palindromic 
frequency and the number of bases in a putative DNA primer, 
as determined by cDNA Matrix analysis. 

Fig. 12 is a schematic representation of the method of 
the invention, palindromic PCR, driven by the enzyme rTth DNA 
polymerase with one DNA primer in one reaction tube; the 
dotted line indicates mRNA and the solid line indicates cDNA; 
the short jagged line represents the single DNA primer. 

Fig. 13A shows the effect of the length of the DNA
primer on the cDNA amplification patterns; the length and
nucleotide sequence of each primer are: A, 8-mer
(5'-TGTCGAGA); B', 9-mer (5'-TGTCCAGAC); C', 10-mer
(5'-TGTCCAGATG) (SEQ ID NO:5); D', 11-mer (5'-TGTCCAGATGC)
(SEQ ID NO:6); E', 12-mer (5'-TGTCCAGATGAC) (SEQ ID NO:7).

Fig. 13B shows the effect of the GC content of the DNA
primer on the cDNA amplification patterns; the GC content and
nucleotide sequence of each primer (10-mer) are: A', 40%
(5'-TGTCCAGATA) (SEQ ID NO:8); B', 50% (5'-TGTCCAGATG) (SEQ
ID NO:5); C, 60% (5'-TGTCCAGACG) (SEQ ID NO:9); D', 70%
(5'-TGTCCAGCCG) (SEQ ID NO:10); E', 80% (5'-TGTCCCGCCG) (SEQ
ID NO:11); F', 90% (5'-TGCCCGGCCG) (SEQ ID NO:12).

Fig. 13C shows the effect of the sequence specificity
of the DNA primer on the cDNA amplification patterns; 10-mer
primers with the same GC content but different sequences are:
A', 5'-TGATGCACTC (SEQ ID NO:13); B', 5'-TGAGCTACTC (SEQ ID
NO:14); C, 5'-TGACTGACTC (SEQ ID NO:15).

Fig. 13D shows palindromic PCR performed by rTth DNA
polymerase (A) with reverse transcription cycles (RT cycles)
and (B) without RT cycles.

Fig. 14 shows the identification of differentially 
expressed genes in human colon carcinoma. 

Fig. 15 shows reamplification of the TCI cDNA fragment
isolated from colon carcinoma; the PCR product was analyzed
on a 1.0% agarose gel; a 0.63 Kb cDNA fragment (arrow) was
detected.

Fig. 16 is an autoradiogram of DNA sequencing gels
showing the presence of PP-1 primer sequence (5'-CTGATCCATG)
(SEQ ID NO:16) at the 5'-end of both strands of the TCI cDNA
fragment; cloning sites are indicated by arrows; sequences
below arrows are pBS (KS) vector sequences reading from T3
primer and T7 primer.

Fig. 17 is a Northern Blot of 24 pairs of colon
carcinoma (T) and their adjacent normal tissue (N) probed
with 32P-labeled TCI cDNA.

Fig. 18 shows the Tumor/Normal RNA Ratio from Northern
Blot results of Fig. 17.

Fig. 19 is a Northern Blot of RNA from carcinoma cells
which result from metastasis from colon carcinoma to liver
(LM) and their adjacent normal liver (NL) probed with
32P-labeled TCI cDNA.

Fig. 20 shows a Northern Blot of RNA from breast cancer
cell line MCF-7 (1) and colon cancer cell line CX-1 (2) 
cultured in vitro, and MCF-7 tumor (3) and CX-1 tumor (4) 
grown in vivo in nude mice. 

Fig. 21 shows staining of formalin-fixed and paraffin- 
embedded colon tumor tissue sections using the monoclonal 
antibody anti-TCI-1 and avidin-biotin-peroxidase detection.

Fig. 22 shows staining as described in Fig. 21, except 
that panels A, C, D represent breast invasive ductal 
carcinoma and panel B, normal breast tissue. 

Fig. 23 shows staining as in Fig. 13, except that
panels A and B represent gastric carcinoma, and panels C 
and D, deeply invasive colon carcinoma. 

Fig. 24 is a Western Blot analysis of protein samples 
from two pairs of colon carcinoma and their adjacent normal 
colon (A) and two pairs of breast carcinoma and their 
adjacent normal breast (B), using a monoclonal antibody
against TCI protein as a probe. 

Fig. 25A shows the ethidium bromide staining pattern of
an RNA gel in which the same amount of RNA from JMN (1) and 
JMN1B (2) cells is loaded per lane. 

Fig. 25B is a Northern Blot analysis of RNA from 
malignant mesothelioma cells JMN1B (2) and JMN (1) using TCI
cDNA as a probe.

Fig. 26 is a Western blot using a monoclonal antibody
against TCI to probe JMN1B cells grown in conditioned medium 
and whole cell lysate. 

Fig. 27 shows JMN1B cells fixed with paraformaldehyde 
without subsequent permeabilization in panels A and B, and 
JMN1B cells fixed with paraformaldehyde and then 
permeabilized in panels C and D. 

Figs. 28A-28D show the corrected nucleotide sequence and 
corresponding amino acid sequence of the full length TCI gene 
and protein.

DESCRIPTION OF THE PREFERRED EMBODIMENTS
TCI (SEQ ID NO: 4) is a novel protein that is found in 
invasive and metastatic tumor cells. The nucleotide sequence 
(SEQ ID NO: 3) encoding TCI was found using a novel technique 
described herein as palindromic PCR, a technique which 
enables identification and cloning of a gene that is 
differentially expressed in tissues. Cloning and sequencing
of the gene encoding TCI and characterization of the protein
are described below, along with examples of how the protein
is detected in invasive and metastatic cancers. Examples
describing additional uses of the TCI protein and its 
fragments, the nucleotide sequence encoding TCI and fragments 
thereof, and antibodies specific for TCI are also included. 

Identification, Cloning and Detection of Expression of the 
TCI Gene 

The identification, cloning, and differential detection
of expression of the TCI gene (SEQ ID NO:3) were performed as
follows. A 636 bp cDNA fragment (SEQ ID NO:1) containing TCI
sequences was identified and isolated by a rapid method
termed palindromic PCR, described herein, from human surgical
colon carcinoma tissue. Briefly, paired mRNAs were isolated
from colon carcinoma tissue (T) and adjacent normal colon
tissue (N) from the same patient, then matched mRNAs were
reverse transcribed to cDNA and subsequently amplified by the
palindromic PCR method described herein, which utilizes one
DNA primer. Both reverse transcription and PCR reactions
were driven by a single enzyme, rTth DNA polymerase, in a
single tube. 35S- or 33P-labeled PCR cDNA fragments were
resolved on a DNA sequencing gel (Fig. 1). A differential
cDNA band (TCI) appeared to be present only in the tumor
sample. This TCI cDNA fragment was recovered from the
sequencing gel and then reamplified with the same palindromic
primer. This 636 bp fragment is identified with a horizontal
arrow in Fig. 2A.

TCI gene expression was examined in colon carcinoma
cells and in the corresponding adjacent colon tissue, and the
results were as follows. Fig. 2B is a Northern Blot of three
pairs of RNA from colon carcinoma (T) and their adjacent
normal colon tissue (N) probed with 32P-labeled TCI cDNA. TCI
mRNA was over-expressed in all three cases of colon
carcinoma, whereas only very weak TCI message appeared in the
adjacent normal tissue. In the bottom panel of the blot,
control RNA was blotted with 32P-labeled cDNA encoding
β-actin.

Northern Blot analysis of several pairs of Tumor/Normal
total RNA using a 32P-labeled TCI cDNA probe revealed that the
TCI mRNA size is about 3.6 kb. This first TCI cDNA fragment
was cloned into a pBluescript plasmid DNA vector.
Nucleotide sequence analysis revealed that this fragment
contained 636 bp with nucleotide sequences corresponding to
the primer sequence at the 5'-ends of both strands of the
double-stranded DNA (Fig. 3 and SEQ ID NO:1). The
corresponding predicted amino acid sequence is shown in
Fig. 3 and provided in SEQ ID NO:2. A search of the GenBank
database with this cDNA fragment revealed that TCI is a novel
gene.

Nucleotide sequence analysis of the 636 bp TCI cDNA
fragment obtained by the described differential display
method revealed that it contained a partial open reading
frame. Therefore, this 636 bp cDNA fragment was used as a
probe to screen a cDNA library. Several overlapping clones
were obtained and contained a 2997 bp sequence. To obtain the
complete open reading frame for TCI, a modified 5'-end RACE
technique was used to amplify the TCI coding regions. The
nucleotide and deduced amino acid sequence of full-length TCI
is shown in Fig. 4 and provided in SEQ ID NOS: 3 and 4. The
N-terminal signal sequence is underlined; one predicted
N-linked glycosylation site (NDT) is boxed and a
polyadenylation signal (AATAAA) is indicated. The cDNA
contains 3126 bp with a potential polyadenylation sequence
(AATAAA) at the 3'-end, beginning at residue 2963. The open
reading frame (ORF) encodes a 777-amino acid protein with a
calculated molecular weight of 86 kD. The TCI protein
contains an amino-terminal signal peptide or secretory leader
signal (ALPARILALALALAL), and one predicted site of N-linked
glycosylation at amino acid residue 605 (NDT). One
chemokine β family motif (C-C) was found at amino acid
residue 85 of TCI.

Analysis of the deduced amino acid sequence (SEQ ID 
NO: 4) revealed that TCI contained four internal homologous
domains of approximately 135 amino acids. A comparison of
these repeats is shown in Fig. 5. Each boxed amino acid is
identical with at least one other residue at that same
position. The interdomain homologies range from 32% (between
domains 2 and 4) to 18% (between domains 1 and 3). Some
amino acid sequences, such as TLF--NEAF and NGVIHXID, are
highly conserved among all four repeats. The notations A/V
and T/S are used herein to indicate that alanine or valine,
and threonine or serine, respectively, may be found at these
positions. In addition, the notation X is used herein to
indicate that this position may include any amino acid. Each
repeat starts with the most divergent sequence. The four 
repeats occur between residues 139-537 and are uninterrupted 
by non-homologous domains. A schematic representation of the 
four repeats of TCI is shown in Fig. 6A. The four homologous 
repeats suggest a tetrameric structure (McLachlan 1980; Zinn
et al., 1988) with two binding sites, one at each intrachain
dimer. The four repeats of TCI may serve as ligand binding 
sites, with the N-terminal or C-terminal domains serving as 
the functional domain. One possible arrangement of the four 
repeated domains and the N- and C-terminal domains is shown 
schematically in Fig. 6B.

The nucleotide and corresponding amino acid sequence of
the TCI gene and protein with a corrected leader signal
sequence are given in Figs. 28A-28D.

Palindromic PCR 

Described below is a novel technique used to identify 
the TCI mRNA and prepare TCI cDNA. Although the sequence of 
bases in a coding and antisense strand of a cDNA molecule 
are, in a sense, "mirror images" of one another, we have 
found that with surprising frequency a short sequence of 
bases, e.g., 9 or 10, in one strand will be found to have an
exact copy in its anti-parallel strand. We call these 
sequences "palindromic" sequences. This phenomenon has been 
used to develop a method of cDNA isolation and amplification. 

In order to determine the frequency of occurrence or 
"palindromic frequency" of these anti-parallel repeats, a 
computer program called DNA Matrix (DNA Strider 1.2) was used 
to analyze double stranded cDNAs which were randomly selected 
from the GenBank database. DNA matrix analysis revealed the 
palindromic frequency of double strand cDNA to be 
surprisingly high and led to our development of a 
relationship between the number of bases in the chosen 
sequence, the "palindromic bases," and the palindromic 
frequency. Single strand cDNA (the mRNA strand) and its 
anti-parallel strand were compared, each from the 5' to 3' 
end by the DNA Matrix program. For example, as illustrated 
in Fig. 10, on average, in every 202 bases of sequence
in one strand of cDNA, there is one 9-base sequence that is
exactly duplicated in another region of its
antiparallel strand. The palindromic frequency found in
native cDNA is much higher than that which would be 
calculated from random composition, suggesting that the 
nucleotide composition of double-stranded cDNA follows 
certain palindromic rules. As shown in Fig. 11, the 
palindromic frequency dramatically decreases when the number 
of bases in the searched segment increases. The key numbers
of bases which lead to a dramatic change in palindromic
frequency are 9, 10 and 11 bases. This, then, is the
theoretical basis for designing a primer for use in the DNA
isolation and amplification method of the invention,
palindromic PCR.
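
A comparison of this kind can be sketched in Python as follows. This is only one possible reading of the "average length to find one X-base palindromic sequence"; it does not reproduce the DNA Matrix program, and the function names and toy strand are hypothetical. The strand is scanned 5' to 3', each k-base window that also occurs in the anti-parallel strand is counted, and the strand length is divided by that count.

```python
def reverse_complement(seq):
    """Return the anti-parallel strand, written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def bases_per_palindromic_site(strand, k):
    """Average number of bases per k-base window that is also present in
    the anti-parallel strand; None if no such window exists."""
    strand = strand.upper()
    other = reverse_complement(strand)
    other_kmers = {other[i:i + k] for i in range(len(other) - k + 1)}
    hits = sum(1 for i in range(len(strand) - k + 1)
               if strand[i:i + k] in other_kmers)
    return len(strand) / hits if hits else None


# Hypothetical toy strand; a real analysis would loop over cDNA
# sequences taken from a database and over several values of k.
toy = "CTGATCAAAAAGATCAG"
for k in (6, 12):
    print(k, bases_per_palindromic_site(toy, k))
```

Applied to a collection of cDNA sequences for k = 7, 9, 11 and 13, the same function would give values that can be set beside those reported for the DNA Matrix analysis, bearing in mind that the counting convention assumed here may differ from the one used there.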

Table 1 presents the statistical data showing the 
palindromic frequency related to the number of bases in the 
searched segment.

Table 1

Palindromic Frequency Related to No. of Bases in Searched
Segment as Revealed by cDNA Matrix Analysis

No. Bases in        Average Length to Find    Palindromic
Searched Segment    One X-Base Palindromic    Frequency
(X Bases)           Sequence
----------------------------------------------------------
7 bases             18 bases                  0.4
9 bases             202 bases                 0.048
11 bases            872 bases                 0.015
13 bases            >1996 bases               <0.007


The principle of the method of palindromic PCR is shown 
in schematic representation in Fig. 12. The general strategy 
is to use a single primer and one enzyme combining both 
reverse transcriptase and DNA polymerase activities, e.g., 
rTth DNA polymerase (from the thermophilic eubacterium 
Thermus thermophilus), to perform both reverse transcription
and polymerase chain reaction in one reaction tube. rTth DNA
polymerase possesses a very efficient reverse transcriptase
activity in the presence of MnCl2 and a thermostable DNA
polymerase activity in the presence of MgCl2. The rTth DNA
polymerase has been observed to be greater than 100-fold more
efficient in coupled reverse transcription and PCR than the
analogous DNA polymerase, Taq (T. W. Myers et al.,
Biochemistry 30:7661, 1991). In this reaction, an
appropriate primer would allow anchored annealing to some
regions of certain mRNA species that contain sequence
complementary to the palindromic primer. This subpopulation
of mRNAs is likely to be reverse transcribed by rTth DNA
polymerase. A "palindromic" primer apparently has a greater
probability of anchoring to the coding regions of mRNA than
an oligo dT primer. Once mRNAs are reverse transcribed to form
a first strand cDNA species, the same primer can anneal to
some regions of the first strand cDNA and function as the
"downstream primer" in a PCR reaction. The same primer can
also function as the "upstream primer." When the primer
anchors to first strand cDNAs, the annealing position to
various cDNA molecules should, in principle, be at different
distances in different molecules from the first annealing
position. Therefore, the amplified cDNA fragments from
various mRNAs will be of different sizes. Once these
PCR-generated cDNA fragments are labeled with 35S-dATP or
33P-dATP, they can be resolved as a ladder by DNA sequencing
gels. A display of cDNAs originating from various mRNAs can
then be visualized after autoradiography.
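
The priming logic described above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not a simulation of the actual reaction: annealing is treated as exact matching, the mRNA is written as DNA, and the function names and toy sequence are hypothetical. The primer is taken to anneal to the mRNA wherever the mRNA carries the primer's reverse complement, and to the first strand cDNA wherever the mRNA-sense sequence carries the primer itself; each such pair of sites defines one predicted fragment with the primer sequence at both 5' ends.

```python
def reverse_complement(seq):
    """Return the anti-parallel strand, written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_all(text, pattern):
    """Start positions of every (possibly overlapping) occurrence."""
    positions, start = [], text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def predicted_fragment_lengths(mrna_sense, primer):
    """Lengths of the products a single primer could amplify from one mRNA."""
    sense, primer = mrna_sense.upper(), primer.upper()
    k = len(primer)
    # Sites where the primer anneals to the mRNA (first strand synthesis).
    first_strand_sites = find_all(sense, reverse_complement(primer))
    # Sites where the primer anneals to the first strand cDNA.
    second_strand_sites = find_all(sense, primer)
    return sorted(d + k - u
                  for u in second_strand_sites
                  for d in first_strand_sites
                  if u + k <= d)


# Hypothetical toy mRNA built around the PP-1 primer sequence.
toy_mrna = "GG" + "CTGATCCATG" + "A" * 20 + "CATGGATCAG" + "TT"
print(predicted_fragment_lengths(toy_mrna, "CTGATCCATG"))
```

Because the two annealing sites lie at different distances in different mRNAs, the predicted lengths differ from one mRNA to the next, which is what allows the products to be resolved as a ladder.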

The selection of the specific palindromic primer depends
on three important factors: the length, the GC content, and
the sequence specificity. DNA Matrix analysis has indicated
that the ideal length of a primer for an appropriate
palindromic frequency is from 9 to 11 bases. Therefore, a
set of primers from 8 to 12 bases in length with 50% GC
content was chosen for study. Our results showed that 9, 10,
and 11 base primers gave an appropriate number of cDNA
fragments readily resolvable by DNA sequencing gels
(Fig. 13A). To identify the GC content of the primer most
suitable for this method, a set of 10-mer primers with GC
content ranging from 40% to 90% was tested. The results
suggested that a GC content from 40% to 80% is acceptable
(Fig. 13B). However, primers with 40% to 60% GC content appear
to yield better results. To examine the effect of the
specific sequence of the primer, 10-mer primers of different
sequences each having 50% GC content were tested. As
predicted, different primers gave rise to different cDNA
patterns (Fig. 13C). As little a difference as three bases
led to totally different cDNA profiles.
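
The length and GC criteria can be checked mechanically. The following Python sketch is illustrative only; the function names are hypothetical, and the thresholds are the 9 to 11 base length and 40% to 60% GC content reported above as giving the better results. The sequences are the 10-mers listed for Fig. 13B together with their stated GC contents.

```python
def gc_content(primer):
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def meets_preferred_criteria(primer):
    """True for primers of 9-11 bases with 40-60% GC content."""
    return 9 <= len(primer) <= 11 and 40.0 <= gc_content(primer) <= 60.0


# 10-mers of Fig. 13B mapped to the GC content stated for each.
fig_13b = {
    "TGTCCAGATA": 40, "TGTCCAGATG": 50, "TGTCCAGACG": 60,
    "TGTCCAGCCG": 70, "TGTCCCGCCG": 80, "TGCCCGGCCG": 90,
}
for seq, stated in fig_13b.items():
    print(seq, gc_content(seq), stated, meets_preferred_criteria(seq))
```

Sequence specificity, the third factor, cannot be reduced to a simple numerical test of this kind, since primers with the same length and GC content gave different cDNA patterns.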

cDNA patterns generated by palindromic PCR are highly
stable. When the same conditions were used but the
experiments repeated at different times, the patterns of the
amplified cDNA fragments were highly reproducible, indicating
the reliability of this method.

In order to be sure of detecting mRNAs with a low copy
number, it was necessary to determine the sensitivity of this
method. It has been reported that the amplification driven
by rTth DNA polymerase is at least 100-fold greater than that
by Taq polymerase. rTth DNA polymerase allows the detection
of IL-1α mRNA, which has a very low copy number, in 80 pg of
total cellular RNA (T.W. Myers et al., Biochemistry 30:7661,
1991). Thus, the higher efficiency of rTth DNA polymerase
ensures that the palindromic PCR method of the invention
provides high sensitivity. In addition, because rTth
polymerase is thermostable, it can also be used to perform
several RT cycles (reverse transcription cycles), which means
several copies of first strand cDNA can be obtained from a
single copy of mRNA. The sensitivity of the method is
increased by performing multiple RT cycles using rTth
polymerase (Fig. 13D).

The method of the invention was tested in a search for
differences in mRNA expression between human colon carcinoma
and the adjacent normal epithelium from a surgical specimen.
Paired mRNA preparations were reverse transcribed with a
palindromic primer 5'-CTGATCCATG (designated as PP-1 primer)
(SEQ ID NO: 16) in the presence of MnCl2 followed by PCR with
the same primer in the presence of MgCl2 using rTth DNA
polymerase. The reaction products were then analyzed by DNA
sequencing gels. About 70-110 amplified cDNA fragments
ranging from 100-700 bases from both preparations were
detected (Fig. 14). Whereas overall cDNA patterns between
tumor and normal tissue are similar, significant differences
were detected by this method. Most cDNA bands showed the
same intensity between tumor and normal preparations, but two
cDNA bands designated as TCI and TC2 appeared with increased
intensities in tumor tissue (Fig. 14). A sample reaction
protocol is described below.

To 1.1 µl of double distilled (dd) H2O is added 0.5 µl
of 10X rTth DNA polymerase reverse transcriptase (RT) buffer
(100 mM Tris-HCl, pH 8.3, 900 mM KCl), 0.5 µl of 10 mM MnCl2,
0.4 µl of 2.50 mM dNTP, 1.0 µl (0.50 µg) of one palindromic
primer (9-11 mer), and 1.0 µl (100 ng) of mRNA to form Mix
A in a total vol. of 4.5 µl. Mix A is heated in a 0.5 ml PCR
tube at 65°C for 6 min and then at 37°C for 8 min. Next, 0.5
µl (1.25 unit) of rTth DNA polymerase is added, the reaction
mixture is mixed well, spun briefly, incubated at 70°C for
12 min and then placed on ice. Mix B, which consists of
12.5 µl of dd H2O, 2.0 µl of 10X chelating buffer (50%
glycerol (v/v), 100 mM Tris-HCl, pH 8.3, 1 M KCl, 0.5% Tween
20), 2.0 µl of 25 mM MgCl2 solution, 2.50 mM dNTP and 2.0 µl
of 35S-dATP (or 33P-dATP), is dispensed in the amount of 20 µl
into each 5.0 µl RT reaction mixture. The samples are mixed
and spun briefly and then overlaid with 25 µl of mineral oil.
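
The volumes in this protocol can be tallied as a quick consistency check. The following is a minimal Python sketch, not part of the described method. The component volumes are taken from the text above (read as microlitres); the helper names and the optional pipetting excess in `scale_mix` are illustrative assumptions.

```python
# Volumes (microlitres) of the Mix A components as listed in the protocol.
MIX_A_UL = {
    "dd H2O": 1.1,
    "10X RT buffer": 0.5,
    "10 mM MnCl2": 0.5,
    "2.50 mM dNTP": 0.4,
    "palindromic primer": 1.0,
    "mRNA": 1.0,
}
ENZYME_UL = 0.5   # rTth DNA polymerase added after the 65/37 deg C step
MIX_B_UL = 20.0   # Mix B dispensed into each RT reaction


def total_volume(components):
    """Sum the component volumes, rounded to avoid float noise."""
    return round(sum(components.values()), 2)


def scale_mix(components, n_reactions, excess=0.10):
    """Scale a mix for several reactions (hypothetical helper).

    `excess` is an assumed pipetting allowance, not a value from the text.
    """
    factor = n_reactions * (1 + excess)
    return {name: round(vol * factor, 2) for name, vol in components.items()}


mix_a_total = total_volume(MIX_A_UL)           # stated in the text as 4.5
rt_total = round(mix_a_total + ENZYME_UL, 2)   # stated in the text as 5.0
pcr_total = round(rt_total + MIX_B_UL, 2)      # RT reaction plus Mix B

print(mix_a_total, rt_total, pcr_total)
```

Scaling Mix A for ten reactions is then `scale_mix(MIX_A_UL, 10)`; the 10% excess is a common bench habit rather than anything the protocol specifies.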

The polymerase chain reaction is then started: 94 °C for 40 
sec, 40°C for 2 min., 72°C for 35 sec. (for 40 cycles, hold 
at 72°C for 4 min.), and then 4°C. 
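
As a rough planning aid, the programmed run time of this cycling profile can be worked out from the step durations given above. This is a sketch only: it counts hold times and ignores ramping between temperatures, which depends on the thermal cycler, and the function name is an illustrative assumption.

```python
# One PCR cycle as (temperature in deg C, duration in seconds), from the text.
CYCLE_STEPS = [(94, 40), (40, 2 * 60), (72, 35)]
N_CYCLES = 40
FINAL_EXTENSION_S = 4 * 60  # hold at 72 deg C after the last cycle


def program_seconds(steps, n_cycles, final_extension_s):
    """Total programmed hold time, ignoring ramping between temperatures."""
    per_cycle = sum(duration for _temp, duration in steps)
    return n_cycles * per_cycle + final_extension_s


total_s = program_seconds(CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION_S)
print(f"{total_s} s = {total_s / 60:.0f} min of programmed hold time")
```

The 2 min annealing step at 40°C dominates each cycle, so the run takes a little over two hours before ramp times are added.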

For cDNA analysis, 7 µl of a PCR sample is mixed with
4 µl of sequencing loading buffer; samples are incubated at
80°C for 3 min. and then placed on ice. 4.5 µl of the
sample is loaded on a 6%-8% agarose DNA sequencing gel.

A gel slice containing a desirable cDNA band (such as
TCI) was soaked in 200 µl of ddH2O for 20 min and then
separated from 3M paper with clean forceps or a plastic
pipette tip. The gel was removed and pounded with an
autoclaved plastic pipette tip. Elution buffer (20 µl) was
added and the mixture was vortexed and left at room
temperature for 4 hrs or overnight. After centrifugation,
cDNA fragments in 10 µl of eluent were reamplified by rTth DNA
polymerase with the same palindromic primer, as described.
After one 40-cycle PCR, the reamplified cDNA could be
detected on agarose gels stained with ethidium bromide. The
amount of cDNA generated was sufficient for cloning and
preparing a probe for Northern Blot analysis. Fig. 15 shows
the gel obtained when the TCI cDNA band was subjected to
elution and reamplification. Total PCR product of the TCI
fragment was 2.5 µg.

The reamplified TCI cDNA fragment was treated with T4
DNA polymerase and cloned into pBluescript plasmid DNA vector
at the SmaI site by blunt end ligation. The nucleotide sequence
of the TCI fragment (SEQ ID NO: 1) showed that a sequence
identical to the PP-1 primer (SEQ ID NO: 16) is indeed present
at the 5'-end of both strands of the TCI fragment (Fig. 16).
This result confirms that the 5'-ends of both complementary
chains of the TCI cDNA fragment used the same palindromic
primer during palindromic PCR as discussed above. It also
implies that the same palindromic primer sequence is present
at the 5'-ends of both strands for every PCR product in the
same reaction. These results establish that a single 9-11
base palindromic primer can effectively prime reverse
transcription and then serve as both a "Downstream primer"
and an "Upstream primer" in palindromic PCR amplification.
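
The observation that the primer sequence sits at the 5'-end of both strands can be checked mechanically: the top strand should begin with the primer and end with the primer's reverse complement. The sketch below is illustrative only; the function names are assumptions, and the two end sequences are the first and last 24 bases of SEQ ID NO: 1 as printed in the sequence listing.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primed_at_both_ends(fragment, primer):
    """True if the primer lies at the 5'-end of both strands of `fragment`.

    On the top strand this means the fragment starts with the primer and
    ends with the primer's reverse complement.
    """
    fragment = fragment.upper()
    primer = primer.upper()
    return fragment.startswith(primer) and fragment.endswith(
        reverse_complement(primer)
    )


PP1 = "CTGATCCATG"
# Ends of SEQ ID NO: 1 as printed; the interior of the 636-base fragment
# is omitted here for brevity, so this string is a stand-in, not the clone.
FRAGMENT_5PRIME = "CTGATCCATGGGAACCAGATTGCA"
FRAGMENT_3PRIME = "GATGATACTCTCAGCATGGATCAG"

print(reverse_complement(PP1))
print(primed_at_both_ends(FRAGMENT_5PRIME + FRAGMENT_3PRIME, PP1))
```

The same check could be run on the full 636-base sequence; only the two ends matter for the result.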

The method of the invention differs from other methods
in a number of ways. In palindromic PCR, only a single
primer (9-11 bases) is used and is sufficient to prime
reverse transcription as well as to support subsequent PCR
for a display of nearly 100 cDNA species. Because the
pattern of amplified cDNAs depends on the sequence of the
single palindromic primer, the species of mRNAs that are
subjected to amplification can readily be controlled by
choosing an appropriate palindromic primer sequence. If a
group or family of genes shares certain sequences, a primer
can be chosen from such a sequence, and a specific display of
this set of mRNAs can readily be performed. Likewise, computer
analysis of the GenBank database may reveal additional
sequences, shared by a set of related genes, that are useful
as a primer. The use of such a primer in the method of the
invention would allow the expression of a given set of genes
to be displayed. Palindromic PCR provides an easy,
sensitive and economical way to identify and isolate
differentially expressed genes related to tumors and other
diseases.

Differential expression of TCI DNA in normal tissue 
and tumor cells. 

Northern Blot analysis, as described above, confirmed
the differential expression of TCI mRNA in colon carcinoma
tissue, and the absence of TCI mRNA in the corresponding
normal tissue. Evaluation of the expression of TCI mRNA in
additional cases of colon carcinoma at different stages was
also undertaken. Surgical specimens of 24 cases of human
primary colon carcinoma and 6 cases of liver metastases were
examined by Northern hybridization of total RNA with
32P-labeled TCI probe.

A Northern Blot of 24 pairs of colon carcinoma (T) and
their adjacent normal tissue (N) probed with 32P-labeled TCI
cDNA is shown in Fig. 17. It is evident from the results
that the level of TCI mRNA in tumor tissue is much greater
than the level in adjacent normal tissue in all 24 cases.
The TCI mRNA levels vary in different cases of carcinoma.
Panels I and II show A: TCI mRNA and B: Control; Panel III
shows TCI mRNA and control (Actin) mRNA.

Fig. 18 shows the Tumor/Normal RNA Ratio from Northern
Blot results of Fig. 17. The horizontal line indicates the
mean Tumor/Normal ratio. TCI mRNA was abundantly expressed
in all 24 cases of primary colon carcinoma and 6 cases of
liver metastases, whereas only a small amount of TCI mRNA was
detected in a few cases of paired adjacent normal tissue.
The mRNA level of TCI was much greater in primary colon
carcinoma than in paired adjacent normal colonic epithelium
in all 24 cases. The Tumor/Normal ratio varied from 5.6 to
92, with the mean Tumor/Normal ratio being 32. The
Tumor/Normal ratio, when plotted against the Dukes' stage of
disease, gave evidence for increasing TCI expression with
increasing stage of colon carcinoma.
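
A Tumor/Normal ratio of this kind is typically derived from band intensities. The sketch below shows one plausible way to compute it, assuming each band is first normalised to its loading control (for example the actin signal); the text does not state how the ratios in Fig. 18 were calculated, and the readings used here are hypothetical, not the data behind Fig. 17 or Fig. 18.

```python
from statistics import mean


def tumor_normal_ratio(tumor_signal, normal_signal,
                       tumor_control=1.0, normal_control=1.0):
    """Ratio of control-normalised tumor signal to normal signal."""
    if min(normal_signal, tumor_control, normal_control) <= 0:
        raise ValueError("normal and control signals must be positive")
    return (tumor_signal / tumor_control) / (normal_signal / normal_control)


# Hypothetical densitometry readings in arbitrary units (NOT patent data).
EXAMPLE_PAIRS = [
    # (tumor, normal, tumor control, normal control)
    (560.0, 100.0, 1.0, 1.0),
    (920.0, 20.0, 1.0, 2.0),
    (300.0, 15.0, 1.0, 1.0),
]

ratios = [tumor_normal_ratio(*pair) for pair in EXAMPLE_PAIRS]
print([round(r, 1) for r in ratios], "mean:", round(mean(ratios), 1))
```

Normalising to the control lane before taking the ratio keeps unequal RNA loading from inflating or masking the difference between tumor and normal tissue.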
In all six cases of paired colon carcinoma metastatic
to liver, the TCI mRNA level was much higher in metastatic
tumor than in adjacent normal liver tissue. Fig. 19 shows
a Northern Blot of RNA from colon carcinoma metastatic to
liver (LM) and the adjacent normal liver (NL) probed with
32P-labeled TCI cDNA. TCI mRNA was expressed only in
metastatic tumor in 5 of 6 samples. Only one sample of
normal liver tissue expressed a very weak TCI message. The
Tumor/Normal ratio is greater than 64. These results
suggested that differential expression of TCI may be
associated with human colorectal cancer progression and
biological aggressiveness of the disease.

In vivo and in vitro expression of TCI mRNA

The expression of TCI mRNA in cultured cancer cells and
in vivo tumor cells was analyzed and is described below. TCI
was overexpressed in tumor tissue in vivo. The expression
of TCI mRNA in cultured cancer cell lines in vitro was
examined by Northern Blot analysis. RNAs isolated from
twelve colon cancer cell lines (HT29, Clone A, MIP101, CX-1,
Morser, CCL227, CCL228, etc.) derived from different stages
of human colon carcinoma, two melanoma cell lines (LOX,
A2058), one breast cancer cell line (MCF-7), two cervical
cancer cell lines (HeLa, A431), three bladder cancer cell
lines (EJ, T24, MB49), one pancreas cancer cell line
(CRL1420), two hepatoma cell lines (HepG2, HepG3) and four
normal cell lines (FS-2, MRC-5, 498A, CV-1) were screened by
Northern Blot analysis. However, the TCI transcript could
not be detected in any of these cell lines. This result
suggested that TCI expression was dramatically decreased or
indeed turned off in cultured cancer cells. However, after
cultured cancer cells were injected into nude mice to grow
tumors in vivo, TCI mRNA expression turned on again and its
mRNA level could be detected by Northern Blot analysis.

Fig. 20 shows a Northern Blot of RNA from breast cancer
cell line MCF-7 (1) and colon cancer cell line CX-1 (2)
cultured in vitro, and MCF-7 tumor (3) and CX-1 tumor (4)
grown in vivo in nude mice. TCI mRNA in colon cancer cell
line CX-1 and breast cancer cell line MCF-7 cultured in vitro
could not be detected by Northern Blot analysis. After
cultured CX-1 and HT29 cells were injected into nude mice to
form tumors in vivo, TCI mRNA was detectable by Northern Blot
analysis, the TCI mRNA levels being dramatically increased
in vivo. This result suggests that TCI gene expression was
turned on or dramatically increased in the tumor cells in
vivo. Thus, the differential expression of the TCI gene
appears to be related to invasion and metastasis of tumor
cells in vivo. The regulation of TCI gene expression in vivo
and in vitro could be a very important model for understanding
tumorigenesis and malignant tumor behavior.

Expression of TCI protein 

The expression of TCI protein in in vivo tumor cells,
cultured carcinoma cells, and in corresponding normal cells
was examined, and is described below. The TCI gene (SEQ ID
NO: 3) was cloned into a plasmid expression vector, and
recombinant TCI protein (SEQ ID NO: 4) was expressed in
bacteria. Several monoclonal antibodies against the
bacterially-produced TCI protein were raised, as will be
described below. A variety of formalin-fixed and
paraffin-embedded tumor tissue sections were examined by
immunohistochemical staining with a mouse monoclonal anti-TCl
antibody, anti-TCl-1, using an avidin-biotinylated-peroxidase
detection technique. Strong positive staining of TCI was
found in primary colon carcinoma (Fig. 21, panel A), colon
carcinoma metastatic to liver (Fig. 21, panel C) and lymph
node (Fig. 21, panel D), breast carcinoma (Fig. 22,
panels A, C, D) and gastric carcinoma (Fig. 23, panels A, B).
The TCI protein level in tumor tissue is much greater than
the level of TCI in adjacent normal tissue (Figs. 21B, 23B).
These results, which represent the staining of
sixteen cases of different stages of primary colon carcinoma,
eight cases of colon tumor metastatic to liver and lymph
node, fourteen cases of breast carcinoma and five cases of
gastric carcinoma, suggested the following three conclusions.
First, the TCI protein level appeared to be different in
different types of carcinoma, with protein levels being
highest in breast carcinoma. Second, the advancing edge of the
deeper invasive tumor appeared to stain more strongly for TCI,
suggesting a greater prevalence of TCI protein at the advancing
edge of the tissue. Third, the more advanced stages of tumor
appeared to contain more TCI protein.

Fig. 24 is a Western Blot analysis of protein samples
from two pairs of colon carcinoma and their adjacent normal
colon (A) and two pairs of breast carcinoma and their
adjacent normal breast (B), using a monoclonal antibody
against TCI protein as a probe. A major 86kd protein (arrow)
was detected by anti-TCl antibody in tumor samples (T) but
not in normal samples (N). The Western Blot analysis
confirmed that tumor tissue contained significantly more TCI
protein than the corresponding adjacent normal tissue.

TCI gene expression 

The presence of TCI mRNA and protein in malignant
mesothelioma cells was examined, and is described below.
More than 42 cell lines have been screened for TCI gene
expression by Northern Blot analysis. However, only two cell
lines, JMN1B and JMN, express detectable mRNA by Northern
Blot analysis. JMN1B and JMN are malignant mesothelioma cell
lines, JMN1B being a subline of JMN cells showing enhanced
tumorigenicity after passage of JMN cells through a nude
mouse. Fig. 25 is a Northern Blot analysis of RNA from
malignant mesothelioma cells JMN1B and JMN using TCI cDNA as
a probe. The results presented in Panel 25B demonstrate that
the TCI mRNA level in JMN1B cells (2) is much greater than that
in JMN cells (1). Panel 25A shows the ethidium bromide
staining pattern of an RNA gel in which the same amount of
JMN (1) and JMN1B (2) RNA is loaded per lane.

The Northern Blot analysis revealed that the TCI mRNA
level is much higher in JMN1B than in JMN, the JMN1B/JMN
ratio being approximately 14. Higher expression of TCI mRNA
in JMN1B could be related to the observed greater
tumorigenicity of JMN1B cells. It has been found that JMN1B
cells can secrete an "EGF-like" growth factor called
transformed mesothelial growth factor (TMGF) that satisfies
the EGF requirement of normal human mesothelial cells. The
difference in the levels of TCI mRNA in JMN1B and JMN cells
provides an ideal cell model for understanding the regulation
of TCI expression and its relation to tumorigenicity.

Sequence analysis of the deduced amino acid sequence has
revealed that the TCI protein (SEQ ID NO: 4) contains a
secretory leader signal at its N-terminus. The secretion of
TCI protein was confirmed by Western Blot analysis of
conditioned medium of JMN1B cells. JMN1B cells were cultured
in regular medium until 90% confluent, then cultured in serum
free medium for two days. This serum free conditioned medium
was analyzed by immunoblotting with anti-TCl monoclonal
antibody. Fig. 26 is a Western blot analysis using a
monoclonal antibody against TCI to probe JMN1B cell
conditioned medium and whole cell lysate. Two major bands
(about 86kd and 104kd) were recognized by anti-TCl antibody
both in JMN1B cell conditioned medium (1) and whole cell
lysate (2). Numbers on the left indicate the position of
molecular weight standards in kilodaltons. The protein size
of the lower molecular weight 86kd band is consistent with
that of deduced TCI protein, whereas the higher molecular
weight 104kd band is consistent with a TCI glycoprotein.
There is one predicted site of N-linked glycosylation at
amino acid residue 605 (NDT) of the deduced TCI protein
sequence. There are 60 threonine residues and 36 serine
residues in the deduced TCI sequence, each of which is a
potential site of O-linked glycosylation.

Human malignant mesothelioma cell line JMN1B can express
abundant TCI. This cell line was used to study the
distribution and localization of TCI protein by
immunofluorescent staining with an anti-TCl monoclonal
antibody followed by Rhodamine conjugated goat anti-mouse IgG
secondary antibody. When JMN1B cells were fixed with
paraformaldehyde without subsequent permeabilization, the
positive staining was seen on the cell surface or outside of
the cell (Fig. 27, panels A, B), which confirms the secretion
of TCI protein. When JMN1B cells were fixed with
paraformaldehyde and then permeabilized, positive staining
appeared in the Golgi complex and the endoplasmic reticulum
(ER) in the cell (Fig. 27, panels C, D), suggesting that TCI
protein is synthesized in the ER and Golgi complex. The
staining in the Golgi complex is clearly evident, indicating
that glycosylation of TCI protein may take place in the Golgi
complex. The TCI protein distribution pattern also suggests
that TCI is a secreted glycoprotein.
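
Two of the sequence-based statements above lend themselves to a quick calculation: the size expected for the unmodified polypeptide, and the search for N-linked glycosylation sequons (N-X-S/T, where X is not proline). The sketch below is illustrative only. The coding-region coordinates 43..2376 come from the feature table of SEQ ID NO: 3; the average residue mass of 110 Da is a rule of thumb, the treatment of the stop codon is an assumption, and the demonstration peptide is hypothetical rather than part of the TCI sequence.

```python
import re

AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption


def orf_residues(cds_start, cds_end, includes_stop=True):
    """Number of amino acids encoded by a CDS given 1-based coordinates."""
    n_codons = (cds_end - cds_start + 1) // 3
    return n_codons - 1 if includes_stop else n_codons


def approx_mass_kd(n_residues):
    """Approximate polypeptide mass in kilodaltons, ignoring modifications."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# N-X-S/T with X != P; the lookahead allows overlapping sequons.
_SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glycosylation_sites(protein):
    """1-based positions of the Asn in each N-X-S/T sequon."""
    return [m.start() + 1 for m in _SEQUON.finditer(protein.upper())]


residues = orf_residues(43, 2376)  # assumes the CDS range includes the stop
print(residues, round(approx_mass_kd(residues), 1))

# Hypothetical peptide: NDT is a sequon, NPS is not (proline at X).
print(n_glycosylation_sites("AANDTKNPSGG"))
```

Under these assumptions the estimate lands close to the 86kd band, and a 104kd species would then need roughly 18kd of added carbohydrate.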

Without being bound to one theory as to the biological
function of TCI, observations as to the prevalence and
expression of TCI mRNA and protein indicate that TCI may be
related to malignant tumor behavior such as invasion and
metastasis. These observations include the following: TCI
is significantly overexpressed in tumor tissue; TCI is a
secreted protein; later stage tumors express higher levels
of TCI; deeper invasive tumors contain higher levels of TCI
protein; TCI expression turns off in cultured tumor cells in
vitro and turns on again after the cells grow as tumor tissue
in vivo. These observations suggest that the function of TCI
is not related to tumor cell proliferation, but is more
likely involved in malignant tumor behavior in vivo, such as
invasion and metastasis.

TCI is a member of a Family of Proteins. 

A FASTA search of the GenBank and EMBL databases with the
TCI open reading frame indicated that the protein is unique.
However, the whole TCI protein shared 45% sequence identity
with a TGF-beta inducible gene, βig-h3, at the amino acid level,
suggesting that TCI and βig-h3 may belong to a new gene
family. The identity between TCI (SEQ ID NO: 4) and βig-h3
(SEQ ID NO: 11) at the amino acid level is shown in Fig. 7.
In Fig. 7, identical amino acids between TCI and βig-h3 are
boxed. Several stretches of amino acids (GSFTXFAPSNEAW,
TLXAPTNEAFEKXP, ATNGVVHXIDXV, LYXGQXLETXGGKXLRVFVYR, and
HYPNGXVTVNCAR) are highly conserved between TCI and βig-h3.
Northern Blot analysis showed that the TCI gene is expressed
from a larger transcript than βig-h3, and DNA sequence
analysis indicated that TCI contains a longer open reading
frame encoding a higher molecular weight protein than the
βig-h3 gene. It has been found that βig-h3 also contains
four internal repeats. The amino acid sequence homology and
structural similarity between TCI and βig-h3 indicate their
functional similarity and relationship. We found that βig-h3
mRNA is also much more abundant in colon carcinoma tissue
than in adjacent normal colon tissue (Fig. 8). Fig. 8 is
a Northern Blot of five pairs of RNA from colon carcinoma (T)
and adjacent normal colon tissue (N) probed with 32P-labeled
βig-h3 cDNA. The blot shows the βig-h3 mRNA level in colon
carcinoma to be much higher than that in adjacent normal
tissue. The bottom panel represents control RNA probed with
32P-labeled β-actin.

In contrast to the expression pattern of TCI mRNA, which
is shown to be largely restricted to in vivo tumor tissue,
βig-h3 mRNA is expressed not only in the tumor tissue, but
also in the cultured tumor cell lines and some
normal cell lines. Though TCI and βig-h3 share significant
homology, their responses to growth factors are distinctly
different.

Fasciclin I, II, III are extrinsic membrane
glycoproteins involved in growth cone guidance during
nervous system development in the insect embryo. A search
of the NBRF protein database revealed a significantly
homologous domain between TCI and Fasciclin I from Grasshopper
and Drosophila. One TCI domain of 204 amino acids (amino acid
residues 503-706) shared 30% identity with Grasshopper
Fasciclin I, and shared 25% identity with Drosophila
Fasciclin. Fig. 9 shows amino acid sequence homology between
TCI (SEQ ID NO: 4) and Fasciclin I from Grasshopper (GrF) (SEQ
ID NO: 18) and Drosophila (DrF) (SEQ ID NO: 19). Boxed amino
acids are identical with at least one other amino acid at
that same position.
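
Identity figures such as the 30% and 25% quoted above are computed over aligned positions. The following is a minimal sketch of that calculation, assuming the two sequences have already been aligned to equal length and that gap positions are excluded from the denominator (conventions differ, and the text does not say which was used). The two demonstration peptides are stretches that appear in the TCI sequence listing (FTLFAPTNEAF in SEQ ID NO: 1 and FTYFAPSNEAW in SEQ ID NO: 3); they are compared here without gaps purely as an illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned, non-gap positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Length of the domain quoted in the text (residues 503-706 inclusive).
domain_length = 706 - 503 + 1

print(domain_length)
print(round(percent_identity("FTLFAPTNEAF", "FTYFAPSNEAW"), 1))
```

For a 204-residue domain, 30% identity corresponds to roughly 61 identical positions if every position is compared.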
It has been found that Fasciclin also contains four
internal homologous domains, each consisting of approximately
150 amino acids. The domains of TCI and Fasciclin I share
some highly conserved amino acid stretches such as 

TXF— PTNXAF, and VXHWDXXLXP. 

I A 

The most conserved sequence among TCI, βig-h3 and

Fasciclin is TXF^PTNXA—. All four internal repeats in TCI 

V w 

or βig-h3 or Fasciclin I also share the most conserved

A T F 

sequence TXF — P— NXA — . This sequence appears to be an 
important motif of this gene family. 

Screening for antagonists to TCI function.

The invention also includes methods of screening for
agents which inhibit TCI gene expression, whether such
inhibition be at the transcriptional or translational level.
Screening methods, according to the invention, for
agents which inhibit expression of the TCI gene in vitro will
include exposing a metastatic cell line in which TCI mRNA is
detectable in culture to an agent suspected of inhibiting
production of the TCI mRNA; and determining the level of TCI
mRNA in the exposed cell line, wherein a decrease in the
level of TCI mRNA after exposure of the cell line to the
agent is indicative of inhibition of TCI mRNA production.

Alternatively, such screening methods may include exposing
in vitro a metastatic cell line in which TCI
protein is detectable in culture to an agent suspected of
inhibiting production of the TCI protein; and determining the
level of TCI protein in the cell line, wherein a decrease in
the level of TCI protein after exposure of the cell line to
the agent is indicative of inhibition of TCI protein
production.

The invention also encompasses in vivo methods of
screening for agents which inhibit expression of the TCI
gene, comprising
exposing a mammal having tumor cells in which TCI mRNA or
protein is detectable to an agent suspected of inhibiting
production of TCI mRNA or protein; and determining the level
of TCI mRNA or protein in tumor cells of the exposed mammal.
A decrease in the level of TCI mRNA or protein after exposure
of the mammal to the agent is indicative of inhibition of TCI
gene expression.

According to the invention, agents can be screened in
vitro or in vivo as follows. For in vitro screening, a
metastatic cell line, e.g., JMN1B, may be cultured in vitro
and exposed to an agent suspected of inhibiting TCI
expression in an amount and for a time sufficient to inhibit
such expression. For in vivo screening, a mammal afflicted
with a late stage cancer, particularly one of breast cancer,
colon cancer, or cancer of the gastrointestinal tract, is
exposed to the agent at a dosage and for a time sufficient
to inhibit expression of TCI. A late stage cancer is defined
by the Dukes' stage of the cancer; i.e., late stage cancers
correspond to Dukes' stages 3-4. The amount or dosage of the
agent which is effective to inhibit TCI expression may be
determined using serial dilutions of the agent. The level
of TCI mRNA or protein may be determined using an aliquot of
cells from the cell culture or the in vivo tumor and
performing Northern Blot analysis or Western Blot analysis,
respectively. The agent will be considered inhibitory if the
level of TCI mRNA or protein decreases by more than 50%, and
preferably more than 70-80%, relative to the same cell line
which has not been exposed to the agent.
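
The inhibition criterion above reduces to a percent-decrease calculation against the unexposed control. The sketch below is one way to express it, not a prescribed procedure: the function names and labels are illustrative, and the "preferably more than 70-80%" language is read here as a lower bound of 70%, which is an assumption.

```python
def percent_decrease(untreated_level, treated_level):
    """Percent decrease of the treated level relative to the untreated one."""
    if untreated_level <= 0:
        raise ValueError("untreated level must be positive")
    return 100.0 * (untreated_level - treated_level) / untreated_level


def classify_agent(untreated_level, treated_level,
                   threshold=50.0, preferred=70.0):
    """Label an agent by the decrease it produces in TCI mRNA or protein.

    `threshold` follows the "more than 50%" criterion in the text;
    `preferred` is an assumed reading of "more than 70-80%".
    """
    decrease = percent_decrease(untreated_level, treated_level)
    if decrease > preferred:
        return "inhibitory (preferred range)"
    if decrease > threshold:
        return "inhibitory"
    return "not inhibitory"


# Hypothetical band intensities (arbitrary units), untreated vs treated.
for treated in (80.0, 40.0, 20.0):
    print(treated, classify_agent(100.0, treated))
```

Because the criterion is "more than 50%", a decrease of exactly 50% is classified as not inhibitory in this sketch.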

Examples of potential inhibitors of TCI mRNA or protein
production include but are not limited to antisense RNA,
competitive inhibitors of the TCI protein such as fragments
of the TCI protein itself, or antibodies to TCI protein.
Candidate TCI inhibitory fragments include, but are not 

limited to, T-F — P— N — A ™ and NG-^HX-^™- 
iillULfciU ' Y V S D W A V V F 

Use of antagonists to TCI functions.

The invention also encompasses the treatment of late
stage cancers by administration to a mammal afflicted with
a late stage cancer of one or more of the above-selected
inhibitory agents. Late stage cancers, particularly those
of the breast, colon, or gastrointestinal tract, are treated
according to the invention by administering the inhibitory
agent to a mammal afflicted with a late stage cancer in an
amount and for a time sufficient to decrease the level of TCI
protein or mRNA.

The mode of administration may be intravenous,
intraperitoneal, by intramuscular or intradermal injection,
or oral. Administration may be by single dose, or may be
continuous or intermittent. The dosage of inhibitory agent
is that dosage which is effective to inhibit TCI production,
i.e., within the range of 10 µg/kg body weight - 100 gm/kg
body weight, preferably within the range of 1 mg/kg body
weight - 1 gm/kg body weight, most preferably 10-100 mg/kg
body weight.

Production of monoclonal antibodies reactive with TCI. 

An anti-TCl antibody is produced according to Kohler and
Milstein, Nature 256:495-497 (1975) and Eur. J. Immunol.
6:511-519 (1976), both of which are hereby incorporated by
reference, using the TCI protein or a fragment thereof as the
immunizing antigen. Hybridomas produced by the above process
are selected for anti-TCl antibodies using TCI as an
antigen in an ELISA assay. The single type of immunoglobulin
produced by such a hybridoma is specific for a single
antigenic determinant, or epitope, on the TCI antigen.
Certain TCl-specific antibodies, for example, anti-TCl-1
produced by the hybridoma deposited with the American Type
Culture Collection (ATCC) under the ATCC number HB 11481, are
unique in that they recognize the TCI protein, more
specifically an epitope of the TCI protein, in
formaldehyde-fixed and paraffin-embedded tumor cells which
bear TCI.

Deposits 

The following samples were deposited on October 29,
1993, with the American Type Culture Collection (ATCC),
12301 Parklawn Drive, Rockville, MD 20852.

Deposit                          ATCC Accession No.

TCI gene in pBluescript          75599
plasmid DNA vector

Hybridoma TC-1                   HB 11481

Applicants' assignee, Dana-Farber Cancer Institute,
Inc., represents that the ATCC is a depository affording
permanence of the deposit and ready accessibility thereto by
the public if a patent is granted. All restrictions on the
availability to the public of the material so deposited will
be irrevocably removed upon the granting of a patent. The
material will be available during the pendency of the patent
application to one determined by the Commissioner to be
entitled thereto under 37 CFR 1.14 and 35 USC 122. The
deposited material will be maintained with all the care
necessary to keep it viable and uncontaminated for a period
of at least five years after the most recent request for the
furnishing of a sample of the deposited microorganism, and
in any case, for a period of at least thirty (30) years after
the date of deposit or for the enforceable life of the
patent, whichever period is longer. Applicants' assignee
acknowledges its duty to replace the deposit should the
depository be unable to furnish a sample when requested due
to the condition of the deposit.

OTHER EMBODIMENTS 
The invention is not limited to those embodiments 
described herein, but may encompass modifications and 
variations which do not depart from the spirit of the 
invention. While the invention has been described in 
connection with specific embodiments thereof, it will be 
understood that further modifications are within the scope 
of the following claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT: Chen, Lan Bo
                    Bao, Shideng
                    Liu, Yuan

    (ii) TITLE OF INVENTION: A NOVEL TUMOR MARKER AND NOVEL METHOD OF
                             ISOLATING SAME

   (iii) NUMBER OF SEQUENCES: 19

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Weingarten, Schurgin, Gagnebin & Hayes
          (B) STREET: Ten Post Office Square
          (C) CITY: Boston
          (D) STATE: Massachusetts
          (E) COUNTRY: USA
          (F) ZIP: 02109

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER: US 08/146,488
          (B) FILING DATE: 29-OCT-1993
          (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Heine, Holliday C.
          (B) REGISTRATION NUMBER: 34,346
          (C) REFERENCE/DOCKET NUMBER: DFCI-333XX

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (617) 543-2290
          (B) TELEFAX: (617) 451-0313

(2) INFORMATION FOR SEQ ID NO: 1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 636 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 1..636

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
CTG ATC CAT GGG AAC CAG ATT GCA ACA AAT GGT GTT GTC CAT GTC ATT
Leu Ile His Gly Asn Gln Ile Ala Thr Asn Gly Val Val His Val Ile
1               5                   10                  15

GAC CGT GTG CTT ACA CAA ATT GGT ACC TCA ATT CAA GAC TTC ATT GAA
Asp Arg Val Leu Thr Gln Ile Gly Thr Ser Ile Gln Asp Phe Ile Glu
            20                  25                  30

GCA GAA GAT GAC CTT TCA TCT TTT AGA GCA GCT GCC ATC ACA TCG GAC
Ala Glu Asp Asp Leu Ser Ser Phe Arg Ala Ala Ala Ile Thr Ser Asp
        35                  40                  45

ATA TTG GAG GCC CTT GGA AGA GAC GGT CAC TTC ACA CTC TTT GCT CCC
Ile Leu Glu Ala Leu Gly Arg Asp Gly His Phe Thr Leu Phe Ala Pro
    50                  55                  60

ACC AAT GAG GCT TTT GAG AAA CTT CCA CGA GGT GTC CTA GAA AGG ATC
Thr Asn Glu Ala Phe Glu Lys Leu Pro Arg Gly Val Leu Glu Arg Ile
65                  70                  75                  80

ATG GGA GAC AAA GTG GCT TCC GAA GCT CTT ATG AAG TAC CAC ATC TTA
Met Gly Asp Lys Val Ala Ser Glu Ala Leu Met Lys Tyr His Ile Leu
                85                  90                  95

AAT ACT CTC CAG TGT TCT GAG TCT ATT ATG GGA GGA GCA GTC TTT GAG
Asn Thr Leu Gln Cys Ser Glu Ser Ile Met Gly Gly Ala Val Phe Glu
            100                 105                 110

ACG CTG GAA GGA AAT ACA ATT GAG ATA GGA TGT GAC GGT GAC AGT ATA
Thr Leu Glu Gly Asn Thr Ile Glu Ile Gly Cys Asp Gly Asp Ser Ile
        115                 120                 125

ACA GTA AAT GGA ATC AAA ATG GTG AAC AAA AAG GAT ATT GTG ACA AAT
Thr Val Asn Gly Ile Lys Met Val Asn Lys Lys Asp Ile Val Thr Asn
    130                 135                 140

AAT GGT GTG ATC CAT TTG ATT GAT CAG GTC CTA ATT CCT GAT TCT GCC
Asn Gly Val Ile His Leu Ile Asp Gln Val Leu Ile Pro Asp Ser Ala
145                 150                 155                 160

AAA CAA GTT ATT GAG CTG GCT GGA AAA CAG CAA ACC ACC TTC ACG GAT
Lys Gln Val Ile Glu Leu Ala Gly Lys Gln Gln Thr Thr Phe Thr Asp
                165                 170                 175

CTT GTG GCC CAA TTA GGC TTG GCA TCT GCT CTG AGG CCA GAT GGA GAA
Leu Val Ala Gln Leu Gly Leu Ala Ser Ala Leu Arg Pro Asp Gly Glu
            180                 185                 190

TAC ACT TTG CTG GCA CCT GTG AAT AAT GCA TTT TCT GAT GAT ACT CTC
Tyr Thr Leu Leu Ala Pro Val Asn Asn Ala Phe Ser Asp Asp Thr Leu
        195                 200                 205

AGC ATG GAT CAG
Ser Met Asp Gln
    210

(2) INFORMATION FOR SEQ ID NO: 2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 212 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Leu Ile His Gly Asn Gln Ile Ala Thr Asn Gly Val Val His Val Ile
1               5                   10                  15

Asp Arg Val Leu Thr Gln Ile Gly Thr Ser Ile Gln Asp Phe Ile Glu
            20                  25                  30

Ala Glu Asp Asp Leu Ser Ser Phe Arg Ala Ala Ala Ile Thr Ser Asp
        35                  40                  45

Ile Leu Glu Ala Leu Gly Arg Asp Gly His Phe Thr Leu Phe Ala Pro
    50                  55                  60

Thr Asn Glu Ala Phe Glu Lys Leu Pro Arg Gly Val Leu Glu Arg Ile
65                  70                  75                  80

Met Gly Asp Lys Val Ala Ser Glu Ala Leu Met Lys Tyr His Ile Leu
                85                  90                  95

Asn Thr Leu Gln Cys Ser Glu Ser Ile Met Gly Gly Ala Val Phe Glu
            100                 105                 110

Thr Leu Glu Gly Asn Thr Ile Glu Ile Gly Cys Asp Gly Asp Ser Ile
        115                 120                 125

Thr Val Asn Gly Ile Lys Met Val Asn Lys Lys Asp Ile Val Thr Asn
    130                 135                 140

Asn Gly Val Ile His Leu Ile Asp Gln Val Leu Ile Pro Asp Ser Ala
145                 150                 155                 160

Lys Gln Val Ile Glu Leu Ala Gly Lys Gln Gln Thr Thr Phe Thr Asp
                165                 170                 175

Leu Val Ala Gln Leu Gly Leu Ala Ser Ala Leu Arg Pro Asp Gly Glu
            180                 185                 190

Tyr Thr Leu Leu Ala Pro Val Asn Asn Ala Phe Ser Asp Asp Thr Leu
        195                 200                 205

Ser Met Asp Gln
    210

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3126 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 43..2376

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GCCACCATGT AGCCCGCGTC ACCGTTCTGC GCATTCCGCA GC ATG GCT CTG CCT
Met Ala Leu Pro
1

GCC CGA ATC CTC GCT CTG GCC CTC GCA CTG GCG CTC GGA CCC GCC GTG
Ala Arg Ile Leu Ala Leu Ala Leu Ala Leu Ala Leu Gly Pro Ala Val
5 10 15 20

ACA CTG GCC AAC CCG GCG AGA ACG CCG TAC GAG CTG GTA CTC CAG AAG
Thr Leu Ala Asn Pro Ala Arg Thr Pro Tyr Glu Leu Val Leu Gln Lys
25 30 35

AGC TCG GCA CGA GGG GGT CGG GAC CAA GGC CCA AAT GTC TGT GCC CTT
Ser Ser Ala Arg Gly Gly Arg Asp Gln Gly Pro Asn Val Cys Ala Leu
40 45 50

CAA CAG ATT TTG GGC ACC AAA AAG AAA TAC TTC AGC ACT TGT AAG AAC
Gln Gln Ile Leu Gly Thr Lys Lys Lys Tyr Phe Ser Thr Cys Lys Asn
55 60 65

TGG TAT AAA AAG TCC ATC TGT GGA CAG AAA ACG ACT GTG TTA TAT GAA
Trp Tyr Lys Lys Ser Ile Cys Gly Gln Lys Thr Thr Val Leu Tyr Glu
70 75 80

TGT TGC CCT GGT TAT ATG AGA ATG GAA GGA ATG AAA GGC TGC CCA GCA
Cys Cys Pro Gly Tyr Met Arg Met Glu Gly Met Lys Gly Cys Pro Ala
85 90 95 100

GTT TTG CCC ATT GAC CAT GTT TAT GGC ACT CTG GGC ATC GTG GGA GCC
Val Leu Pro Ile Asp His Val Tyr Gly Thr Leu Gly Ile Val Gly Ala
105 110 115

ACC ACA ACG CAG CGC TAT TCT GAC GCC TCA AAA CTG AGG GAG GAG ATC
Thr Thr Thr Gln Arg Tyr Ser Asp Ala Ser Lys Leu Arg Glu Glu Ile
120 125 130

GAG GGA AAG GGA TCC TTC ACT TAC TTT GCA CCG AGT AAT GAG GCT TGG
Glu Gly Lys Gly Ser Phe Thr Tyr Phe Ala Pro Ser Asn Glu Ala Trp
135 140 145

GAC AAC TTG GAT TCT GAT ATC CGT AGA GGT TTG GAG AGC AAC GTG AAT
Asp Asn Leu Asp Ser Asp Ile Arg Arg Gly Leu Glu Ser Asn Val Asn
150 155 160

GTT GAA TTA CTG AAT GCT TTA CAT AGT CAC ATG ATT AAT AAG AGA ATG
Val Glu Leu Leu Asn Ala Leu His Ser His Met Ile Asn Lys Arg Met
165 170 175 180

TTG ACC AAG GAC TTA AAA AAT GGC ATG ATT ATT CCT TCA ATG TAT AAC
Leu Thr Lys Asp Leu Lys Asn Gly Met Ile Ile Pro Ser Met Tyr Asn
185 190 195

AAT TTG GGG CTT TTC ATT AAC CAT TAT CCT AAT GGG GTT GTC ACT GTT 678
Asn Leu Gly Leu Phe Ile Asn His Tyr Pro Asn Gly Val Val Thr Val
200 205 210

AAT TGT GCT CGA ATC ATC CAT GGG AAC CAG ATT GCA ACA AAT GGT GTT 726
Asn Cys Ala Arg Ile Ile His Gly Asn Gln Ile Ala Thr Asn Gly Val
215 220 225

GTC CAT GTC ATT GAC CGT GTG CTT ACA CAA ATT GGT ACC TCA ATT CAA 774
Val His Val Ile Asp Arg Val Leu Thr Gln Ile Gly Thr Ser Ile Gln
230 235 240

GAC TTC ATT GAA GCA GAA GAT GAC CTT TCA TCT TTT AGA GCA GCT GCC 822
Asp Phe Ile Glu Ala Glu Asp Asp Leu Ser Ser Phe Arg Ala Ala Ala
245 250 255 260

ATC ACA TCG GAC ATA TTG GAG GCC CTT GGA AGA GAC GGT CAC TTC ACA 870
Ile Thr Ser Asp Ile Leu Glu Ala Leu Gly Arg Asp Gly His Phe Thr
265 270 275

CTC TTT GCT CCC ACC AAT GAG GCT TTT GAG AAA CTT CCA CGA GGT GTC 918
Leu Phe Ala Pro Thr Asn Glu Ala Phe Glu Lys Leu Pro Arg Gly Val
280 285 290

CTA GAA AGG ATC ATG GGA GAC AAA GTG GCT TCC GAA GCT CTT ATG AAG 966
Leu Glu Arg Ile Met Gly Asp Lys Val Ala Ser Glu Ala Leu Met Lys
295 300 305

TAC CAC ATC TTA AAT ACT CTC CAG TGT TCT GAG TCT ATT ATG GGA GGA 1014
Tyr His Ile Leu Asn Thr Leu Gln Cys Ser Glu Ser Ile Met Gly Gly
310 315 320

GCA GTC TTT GAG ACG CTG GAA GGA AAT ACA ATT GAG ATA GGA TGT GAC 1062
Ala Val Phe Glu Thr Leu Glu Gly Asn Thr Ile Glu Ile Gly Cys Asp
325 330 335 340

GGT GAC AGT ATA ACA GTA AAT GGA ATC AAA ATG GTG AAC AAA AAG GAT 1110
Gly Asp Ser Ile Thr Val Asn Gly Ile Lys Met Val Asn Lys Lys Asp
345 350 355

ATT GTG ACA AAT AAT GGT GTG ATC CAT TTG ATT GAT CAG GTC CTA ATT 1158
Ile Val Thr Asn Asn Gly Val Ile His Leu Ile Asp Gln Val Leu Ile
360 365 370

CCT GAT TCT GCC AAA CAA GTT ATT GAG CTG GCT GGA AAA CAG CAA ACC 1206
Pro Asp Ser Ala Lys Gln Val Ile Glu Leu Ala Gly Lys Gln Gln Thr
375 380 385

ACC TTC ACG GAT CTT GTG GCC CAA TTA GGC TTG GCA TCT GCT CTG AGG 1254
Thr Phe Thr Asp Leu Val Ala Gln Leu Gly Leu Ala Ser Ala Leu Arg
390 395 400

CCA GAT GGA GAA TAC ACT TTG CTG GCA CCT GTG AAT AAT GCA TTT TCT 1302
Pro Asp Gly Glu Tyr Thr Leu Leu Ala Pro Val Asn Asn Ala Phe Ser
405 410 415 420

GAT GAT ACT CTC AGC ATG GAT CAG CGC CTC CTT AAA TTA ATT CTG CAG 1350
Asp Asp Thr Leu Ser Met Asp Gln Arg Leu Leu Lys Leu Ile Leu Gln
425 430 435

AAT CAC ATA TTG AAA GTA AAA GTT GGC CTT AAT GAG CTT TAC AAC GGG 1398
Asn His Ile Leu Lys Val Lys Val Gly Leu Asn Glu Leu Tyr Asn Gly
440 445 450

CAA ATA CTG GAA ACC ATC GGA GGC AAA CAG CTC AGA GTC TTC GTA TAT 1446
Gln Ile Leu Glu Thr Ile Gly Gly Lys Gln Leu Arg Val Phe Val Tyr
455 460 465

CGT ACA GCT GTC TGC ATT GAA AAT TCA TGC ATG GAG AAA GGG AGT AAG 1494
Arg Thr Ala Val Cys Ile Glu Asn Ser Cys Met Glu Lys Gly Ser Lys
470 475 480

CAA GGG AGA AAC GGT GCG ATT CAC ATA TTC CGC GAG ATC ATC AAG CCA 1542
Gln Gly Arg Asn Gly Ala Ile His Ile Phe Arg Glu Ile Ile Lys Pro
485 490 495 500

GCA GAG AAA TCC CTC CAT GAA AAG TTA AAA CAA GAT AAG CGC TTT ACG 1590
Ala Glu Lys Ser Leu His Glu Lys Leu Lys Gln Asp Lys Arg Phe Thr
505 510 515

ACC TTC CTC AGC CTA CTT GAA GCT GCA GAC TTG AAA GAG CTC CTG ACA 1638
Thr Phe Leu Ser Leu Leu Glu Ala Ala Asp Leu Lys Glu Leu Leu Thr
520 525 530

CAA CCT GGA GAC TGG ACA TTA TTT GTG CCA ACC AAT GAT GCT TTT AAG 1686
Gln Pro Gly Asp Trp Thr Leu Phe Val Pro Thr Asn Asp Ala Phe Lys
535 540 545

GGA ATG ACT AGT GAA GAA AAA GAA ATT CTG ATA CGG GAC AAA AAT GCT 1734
Gly Met Thr Ser Glu Glu Lys Glu Ile Leu Ile Arg Asp Lys Asn Ala
550 555 560

CTT CAA AAC ATC ATT CTT TAT CAC CTG ACA CCA GGA GTT TTC ATT GGA 1782
Leu Gln Asn Ile Ile Leu Tyr His Leu Thr Pro Gly Val Phe Ile Gly
565 570 575 580

AAA GGA TTT GAA CCT GGT GTT ACT AAC ATT TTA AAG ACC ACA CAA GGA 1830
Lys Gly Phe Glu Pro Gly Val Thr Asn Ile Leu Lys Thr Thr Gln Gly
585 590 595

AGC AAA ATC TTT CTG AAA GAA GTA AAT GAT ACA CTT CTG GTG AAT GAA 1878
Ser Lys Ile Phe Leu Lys Glu Val Asn Asp Thr Leu Leu Val Asn Glu
600 605 610

TTG AAA TCA AAA GAA TCT GAC ATC ATG ACA ACA AAT GGT GTA ATT CAT 1926
Leu Lys Ser Lys Glu Ser Asp Ile Met Thr Thr Asn Gly Val Ile His
615 620 625

GTT GTA GAT AAA CTC CTC TAT CCA GCA GAC ACA CCT GTT GGA AAT GAT 1974
Val Val Asp Lys Leu Leu Tyr Pro Ala Asp Thr Pro Val Gly Asn Asp
630 635 640

CAA CTG CTG GAA ATA CTT AAT AAA TTA ATC AAA TAC ATC CAA ATT AAG 2022
Gln Leu Leu Glu Ile Leu Asn Lys Leu Ile Lys Tyr Ile Gln Ile Lys
645 650 655 660

TTT GTT CGT GGT AGC ACC TTC AAA GAA ATC CCC GTG ACT GTC TAT AGA 2070
Phe Val Arg Gly Ser Thr Phe Lys Glu Ile Pro Val Thr Val Tyr Arg
665 670 675

CCC ACA CTA ACA AAA GTC AAA ATT GAA GGT GAA CCT GAA TTC AGA CTG 2118
Pro Thr Leu Thr Lys Val Lys Ile Glu Gly Glu Pro Glu Phe Arg Leu
680 685 690

ATT AAA GAA GGT GAA ACA ATA ACT GAA GTG ATC CAT GGA GAG CCA ATT 2166
Ile Lys Glu Gly Glu Thr Ile Thr Glu Val Ile His Gly Glu Pro Ile
695 700 705

ATT AAA AAA TAC ACC AAA ATC ATT GAT GGA GTG CCT GTG GAA ATA ACT 2214
Ile Lys Lys Tyr Thr Lys Ile Ile Asp Gly Val Pro Val Glu Ile Thr
710 715 720

GAA AAA GAG ACA CGA GAA GAA CGA ATC ATT ACA GGT CCT GAA ATA AAA 2262
Glu Lys Glu Thr Arg Glu Glu Arg Ile Ile Thr Gly Pro Glu Ile Lys
725 730 735 740

TAC ACT AGG ATT TCT ACT GGA GGT GGA GAA ACA GAA GAA ACT CTG AAG 2310
Tyr Thr Arg Ile Ser Thr Gly Gly Gly Glu Thr Glu Glu Thr Leu Lys
745 750 755

AAA TTG TTA CAA GAA GAC ACA CCC GTG AGG AAG TTG CAA GCC AAC AAA 2358
Lys Leu Leu Gln Glu Asp Thr Pro Val Arg Lys Leu Gln Ala Asn Lys
760 765 770

AAA AGT TCA AGG ATC TAGAAGACGA TTAAGGGAAG GTCGTTCTCA GTGAAAATCC 2413
Lys Ser Ser Arg Ile
775

AAAAACCAGA AAAAAATGTT TATACAACCC TAAGTCAATA ACCTGACCTT AGAAAATTGT 2473

GAGAGCCAAG TTGACTTCAG GAACTGAAAC ATCAGCACAA AGAAGCAATC ATCAAATAAT 2533

TCTGAACACA AATTTAATAT TTTTTTTTCT GAATGAGAAA CATGAGGGAA ATTGTGGAGT 2593

TAGCCTCCTG TGGTAAAGGA ATTGAAGAAA ATATAACACC TTACACCCTT TTTCATCTTG 2653

ACATTAAAAG TTCTGGCTAA CTTTGGAATC CATTAGAGAA AAATCCTTGT CACCAGATTC 2713

ATTACAATTC AAATCGAAGA GTTGTGAACT GTTATCCCAT TGAAAAGACC GAGCCTTGTA 2773

TGTATGTTAT GGATACATAA AATGCACGCA AGCCATTATC TCTCCATGGG AAGCTAAGTT 2833

ATAAAAATAG GTGCTTGGTG TACAAAACTT TTTATGATCA AAAGGCTTTG CACATTTCTA 2893

TATGAGTGGG TTTACTGGTA AATTATGTTA TTTTTTACAA CTAATTTTGT ACTCTCAGAA 2953

TGTTTGTCAT ATGCTTCTTG CAATGCATAT TTTTTAATCT CAAACGTTTC AATAAAACCA 3013

TTTTTCAGAT ATAAAGAGAA TTACTTCAAA TTGAGTAATT CAGAAAAACT CAAGATTTAA 3073

GTTAAAAAGT GGTTTGGACT TGGGAATAGG ACTTTATACC TCTTTCTCGT GCC 3126

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 777 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ala Leu Pro Ala Arg Ile Leu Ala Leu Ala Leu Ala Leu Ala Leu
1 5 10 15

Gly Pro Ala Val Thr Leu Ala Asn Pro Ala Arg Thr Pro Tyr Glu Leu
20 25 30

Val Leu Gln Lys Ser Ser Ala Arg Gly Gly Arg Asp Gln Gly Pro Asn
35 40 45

Val Cys Ala Leu Gln Gln Ile Leu Gly Thr Lys Lys Lys Tyr Phe Ser
50 55 60

Thr Cys Lys Asn Trp Tyr Lys Lys Ser Ile Cys Gly Gln Lys Thr Thr
65 70 75 80

Val Leu Tyr Glu Cys Cys Pro Gly Tyr Met Arg Met Glu Gly Met Lys
85 90 95

Gly Cys Pro Ala Val Leu Pro Ile Asp His Val Tyr Gly Thr Leu Gly
100 105 110

Ile Val Gly Ala Thr Thr Thr Gln Arg Tyr Ser Asp Ala Ser Lys Leu
115 120 125

Arg Glu Glu Ile Glu Gly Lys Gly Ser Phe Thr Tyr Phe Ala Pro Ser
130 135 140

Asn Glu Ala Trp Asp Asn Leu Asp Ser Asp Ile Arg Arg Gly Leu Glu
145 150 155 160

Ser Asn Val Asn Val Glu Leu Leu Asn Ala Leu His Ser His Met Ile
165 170 175

Asn Lys Arg Met Leu Thr Lys Asp Leu Lys Asn Gly Met Ile Ile Pro
180 185 190

Ser Met Tyr Asn Asn Leu Gly Leu Phe Ile Asn His Tyr Pro Asn Gly
195 200 205

Val Val Thr Val Asn Cys Ala Arg Ile Ile His Gly Asn Gln Ile Ala
210 215 220

Thr Asn Gly Val Val His Val Ile Asp Arg Val Leu Thr Gln Ile Gly
225 230 235 240

Thr Ser Ile Gln Asp Phe Ile Glu Ala Glu Asp Asp Leu Ser Ser Phe
245 250 255

Arg Ala Ala Ala Ile Thr Ser Asp Ile Leu Glu Ala Leu Gly Arg Asp
260 265 270

Gly His Phe Thr Leu Phe Ala Pro Thr Asn Glu Ala Phe Glu Lys Leu
275 280 285

Pro Arg Gly Val Leu Glu Arg Ile Met Gly Asp Lys Val Ala Ser Glu
290 295 300

Ala Leu Met Lys Tyr His Ile Leu Asn Thr Leu Gln Cys Ser Glu Ser
305 310 315 320

Ile Met Gly Gly Ala Val Phe Glu Thr Leu Glu Gly Asn Thr Ile Glu
325 330 335

Ile Gly Cys Asp Gly Asp Ser Ile Thr Val Asn Gly Ile Lys Met Val
340 345 350

Asn Lys Lys Asp Ile Val Thr Asn Asn Gly Val Ile His Leu Ile Asp
355 360 365

Gln Val Leu Ile Pro Asp Ser Ala Lys Gln Val Ile Glu Leu Ala Gly
370 375 380

Lys Gln Gln Thr Thr Phe Thr Asp Leu Val Ala Gln Leu Gly Leu Ala
385 390 395 400

Ser Ala Leu Arg Pro Asp Gly Glu Tyr Thr Leu Leu Ala Pro Val Asn
405 410 415

Asn Ala Phe Ser Asp Asp Thr Leu Ser Met Asp Gln Arg Leu Leu Lys
420 425 430

Leu Ile Leu Gln Asn His Ile Leu Lys Val Lys Val Gly Leu Asn Glu
435 440 445

Leu Tyr Asn Gly Gln Ile Leu Glu Thr Ile Gly Gly Lys Gln Leu Arg
450 455 460

Val Phe Val Tyr Arg Thr Ala Val Cys Ile Glu Asn Ser Cys Met Glu
465 470 475 480

Lys Gly Ser Lys Gln Gly Arg Asn Gly Ala Ile His Ile Phe Arg Glu
485 490 495

Ile Ile Lys Pro Ala Glu Lys Ser Leu His Glu Lys Leu Lys Gln Asp
500 505 510

Lys Arg Phe Thr Thr Phe Leu Ser Leu Leu Glu Ala Ala Asp Leu Lys
515 520 525

Glu Leu Leu Thr Gln Pro Gly Asp Trp Thr Leu Phe Val Pro Thr Asn
530 535 540

Asp Ala Phe Lys Gly Met Thr Ser Glu Glu Lys Glu Ile Leu Ile Arg
545 550 555 560

Asp Lys Asn Ala Leu Gln Asn Ile Ile Leu Tyr His Leu Thr Pro Gly
565 570 575

Val Phe Ile Gly Lys Gly Phe Glu Pro Gly Val Thr Asn Ile Leu Lys
580 585 590

Thr Thr Gln Gly Ser Lys Ile Phe Leu Lys Glu Val Asn Asp Thr Leu
595 600 605

Leu Val Asn Glu Leu Lys Ser Lys Glu Ser Asp Ile Met Thr Thr Asn
610 615 620

Gly Val Ile His Val Val Asp Lys Leu Leu Tyr Pro Ala Asp Thr Pro
625 630 635 640

Val Gly Asn Asp Gln Leu Leu Glu Ile Leu Asn Lys Leu Ile Lys Tyr
645 650 655

Ile Gln Ile Lys Phe Val Arg Gly Ser Thr Phe Lys Glu Ile Pro Val

660 665 670

Thr Val Tyr Arg Pro Thr Leu Thr Lys Val Lys Ile Glu Gly Glu Pro
675 680 685

Glu Phe Arg Leu Ile Lys Glu Gly Glu Thr Ile Thr Glu Val Ile His
690 695 700

Gly Glu Pro Ile Ile Lys Lys Tyr Thr Lys Ile Ile Asp Gly Val Pro
705 710 715 720

Val Glu Ile Thr Glu Lys Glu Thr Arg Glu Glu Arg Ile Ile Thr Gly
725 730 735

Pro Glu Ile Lys Tyr Thr Arg Ile Ser Thr Gly Gly Gly Glu Thr Glu
740 745 750

Glu Thr Leu Lys Lys Leu Leu Gln Glu Asp Thr Pro Val Arg Lys Leu
755 760 765

Gln Ala Asn Lys Lys Ser Ser Arg Ile
770 775
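
The agreement between the CDS feature of SEQ ID NO: 3 (positions 43 and 2376) and the 777-residue protein of SEQ ID NO: 4 can be checked mechanically. The following is a minimal Python sketch, not part of the original disclosure. It assumes the standard genetic code and assumes that the CDS coordinates are 1-based, inclusive, and include the terminal stop codon. The function names `translate` and `cds_residue_count` are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (assumption: no alternative code applies).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an open reading frame, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop the spaces used in the listing
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def cds_residue_count(start: int, end: int) -> int:
    """Residues encoded by a CDS whose 1-based inclusive coordinates
    include the terminal stop codon (hypothetical helper)."""
    length = end - start + 1
    if length % 3:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3 - 1


print(cds_residue_count(43, 2376))           # 777
print(translate("ATG GCT CTG CCT"))          # MALP
print(translate("AAA AGT TCA AGG ATC TAG"))  # KSSRI
```

The two short inputs are the first four codons and the last six codons (including the stop) of the coding region as printed in SEQ ID NO: 3. Their one-letter translations correspond to Met Ala Leu Pro and Lys Ser Ser Arg Ile in SEQ ID NO: 4.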

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

TGTCCAGATG 10

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TGTCCAGATG C 11



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TGTCCAGATG AC 12



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TGTCCAGATA 10



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TGTCCAGACG 10

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TGTCCAGCCG 10

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

TGTCCCGCCG 10

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TGCCCGGCCG 10



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

TGATGCACTC 10

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TGAGCTACTC 10

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TGACTGACTC 10

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CTGATCCATG 10

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 683 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Met Ala Leu Phe Val Arg Leu Leu Ala Leu Ala Leu Ala Leu Ala Leu
1 5 10 15

Gly Pro Ala Ala Thr Leu Ala Gly Pro Ala Lys Ser Pro Tyr Gln Leu
20 25 30

Pro Leu Gln His Ser Arg Leu Arg Gly Arg Gln His Gly Pro Asn Val
35 40 45

Cys Ala Val Thr Lys Val Ile Gly Thr Asn Arg Lys Tyr Phe Thr Asn
50 55 60

Cys Lys Gln Trp Tyr Gln Arg Lys Ile Cys Gly Lys Ser Thr Val Ile
65 70 75 80

Ser Tyr Glu Cys Cys Pro Gly Tyr Glu Lys Val Pro Gly Glu Lys Gly
85 90 95

Cys Pro Ala Ala Leu Pro Leu Ser Asn Leu Tyr Glu Thr Leu Gly Val
100 105 110

Val Gly Ser Thr Thr Thr Gln Leu Tyr Thr Asp Arg Thr Glu Lys Leu
115 120 125

Arg Pro Glu Met Glu Gly Pro Gly Ser Phe Thr Ile Phe Ala Pro Ser
130 135 140

Asn Glu Ala Trp Ala Ser Leu Pro Ala Glu Val Leu Val Ser Leu Val
145 150 155 160

Ser Asn Val Asn Ile Glu Leu Leu Asn Ala Leu Arg Tyr His Met Val
165 170 175

Gly Arg Arg Val Leu Thr Asp Glu Leu Lys His Gly Met Thr Leu Thr
180 185 190

Ser Met Tyr Gln Asn Ser Asn Ile Gln Ile His His Tyr Pro Asn Gly
195 200 205

Ile Val Thr Val Asn Cys Ala Arg Leu Leu Lys Ala Asp His His Ala
210 215 220

Thr Asn Gly Val Val His Leu Ile Asp Lys Val Ile Ser Thr Ile Thr
225 230 235 240

Asn Asn Ile Gln Gln Ile Ile Glu Ile Glu Asp Thr Phe Glu Thr Leu
245 250 255

Arg Ala Ala Val Ala Ala Ser Gly Leu Asn Thr Met Leu Glu Gly Asn
260 265 270

Gly Gln Tyr Thr Leu Leu Ala Pro Thr Asn Glu Ala Phe Glu Lys Ile
275 280 285

Pro Ser Glu Thr Leu Asn Arg Ile Leu Gly Asp Pro Glu Ala Leu Arg
290 295 300

Asp Leu Leu Asn Asn His Ile Leu Lys Ser Ala Met Cys Ala Glu Ala
305 310 315 320

Ile Val Ala Gly Leu Ser Val Glu Thr Leu Glu Gly Thr Thr Leu Glu
325 330 335

Val Gly Cys Ser Gly Asp Met Leu Thr Ile Asn Gly Lys Ala Ile Ile
340 345 350

Ser Asn Lys Asp Ile Leu Ala Thr Asn Gly Val Ile His Tyr Ile Asp
355 360 365

Glu Leu Leu Ile Pro Asp Ser Ala Lys Thr Leu Phe Glu Leu Ala Ala
370 375 380

Glu Ser Asp Val Ser Thr Ala Ile Asp Leu Phe Arg Gln Ala Gly Leu
385 390 395 400

Gly Asn His Leu Ser Gly Ser Glu Arg Leu Thr Leu Leu Ala Pro Leu
405 410 415

Asn Ser Val Phe Lys Asp Gly Thr Pro Pro Ile Asp Ala His Thr Arg
420 425 430

Asn Leu Leu Arg Asn His Ile Ile Lys Asp Gln Leu Ala Ser Lys Tyr
435 440 445

Leu Tyr His Gly Gln Thr Leu Glu Thr Leu Gly Gly Lys Lys Leu Arg
450 455 460

Val Phe Val Tyr Arg Asn Ser Leu Cys Ile Glu Asn Ser Cys Ile Ala
465 470 475 480

Ala His Asp Lys Arg Gly Arg Tyr Gly Thr Leu Phe Thr Met Asp Arg
485 490 495

Val Leu Thr Pro Pro Met Gly Thr Val Met Asp Val Leu Lys Gly Asp
500 505 510

Asn Arg Phe Ser Met Leu Val Ala Ala Ile Gln Ser Ala Gly Leu Thr
515 520 525

Glu Thr Leu Asn Arg Glu Gly Val Tyr Thr Val Phe Ala Pro Thr Asn
530 535 540

Glu Ala Phe Arg Ala Leu Pro Pro Arg Glu Ser Arg Arg Leu Leu Gly
545 550 555 560

Asp Ala Lys Glu Leu Ala Asn Ile Leu Lys Tyr His Ile Gly Asp Glu
565 570 575

Ile Leu Val Ser Gly Gly Ile Gly Ala Leu Val Arg Leu Lys Ser Leu
580 585 590

Gln Gly Asp Lys Leu Glu Val Ser Leu Lys Asn Asn Val Val Ser Val
595 600 605

Asn Lys Glu Pro Val Ala Glu Pro Asp Ile Met Ala Thr Asn Gly Val
610 615 620

Val His Val Ile Thr Asn Val Leu Gln Pro Pro Ala Asn Arg Pro Gln
625 630 635 640

Glu Arg Gly Asp Glu Leu Ala Asp Ser Ala Leu Glu Ile Phe Lys Gln

645 650 655

Ala Ser Ala Phe Ser Arg Ala Ser Gln Arg Ser Val Arg Leu Ala Val
660 665 670

Pro Tyr Gln Lys Leu Leu Glu Arg Met Lys His
675 680

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 206 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Gly Glu Lys Ser Leu Glu Tyr Lys Ile Arg Asp Asp Pro Asp Leu Ser
1 5 10 15

Gln Phe Tyr Ser Trp Leu Glu His Asn Glu Val Ala Asn Ser Thr Leu
20 25 30

Gln Leu Arg Gln Val Thr Val Phe Ala Pro Thr Asn Leu Ala Gln Phe
35 40 45

Asn Tyr Lys Ala Arg Asp Gly Asp Glu Asn Ile Ile Leu Tyr His Met
50 55 60

Thr Asn Leu Ala His Ser Leu Asp Gln Leu Gly His Lys Val Asn Ser
65 70 75 80

Glu Leu Asp Gly Asn Pro Pro Leu Trp Ile Thr Arg Arg Arg Asp Thr
85 90 95

Ile Phe Val Asn Asn Ala Arg Val Leu Thr Glu Arg Ser Asn Tyr Glu
100 105 110

Ala Val Asn Arg His Gly Lys Lys Gln Val Leu His Val Val Asp Ser
115 120 125

Val Leu Glu Pro Val Trp Ser Thr Ser Gly Gln Leu Tyr Asn Pro Asp
130 135 140

Ala Phe Gln Phe Leu Asn Gln Ser Glu Asn Leu Asp Leu Gly Leu His
145 150 155 160

Arg Val Arg Ser Phe Arg Gln Arg Val Phe Gln Asn Gln Lys Gln Asn
165 170 175

Asp Phe Lys Leu Glu Gly Lys His Thr Phe Phe Ile Pro Val Asp Glu
180 185 190

Gly Phe Lys Pro Leu Pro Arg Pro Glu Lys Ile Asp Gln Lys

195 200 205

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 200 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(v) FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Ala Ala Ala Asp Leu Ala Asp Lys Leu Arg Asp Asp Ser Glu Leu Ser
1 5 10 15

Gln Phe Tyr Ser Leu Leu Glu Ser Asn Gln Ile Ala Asn Ser Thr Leu
20 25 30

Ser Leu Arg Ser Cys Thr Ile Phe Val Pro Thr Asn Glu Ala Phe Gln
35 40 45

Arg Tyr Lys Ser Lys Thr Ala His Val Leu Tyr His Ile Thr Thr Glu
50 55 60

Ala Tyr Thr Gln Lys Arg Leu Pro Asn Thr Val Ser Ser Asp Met Ala
65 70 75 80

Gly Asn Pro Pro Leu Tyr Ile Thr Lys Asn Ser Asn Gly Asp Ile Phe
85 90 95

Val Gly Asn Ala Arg Ile Ile Pro Ser Leu Ser Val Glu Thr Asn Ser
100 105 110

Asp Gly Lys Arg Gln Ile Met His Ile Ile Asp Glu Val Leu Glu Pro
115 120 125

Leu Thr Val Lys Ala Gly His Ser Asp Thr Pro Asn Asn Pro Asn Ala
130 135 140

Leu Lys Phe Leu Lys Asn Ala Glu Glu Phe Asn Val Asp Asn Ile Gly
145 150 155 160

Val Arg Thr Tyr Arg Ser Gln Val Thr Met Ala Lys Lys Glu Ser Val
165 170 175

Tyr Asp Ala Ala Gly Gln His Thr Phe Leu Val Pro Val Asp Glu Gly
180 185 190

Phe Lys Leu Ser Ala Arg Ser Ser
195 200

CLAIMS

What is claimed is: 

1. A monoclonal antibody that binds to an epitope of TCI in formalin- 
fixed or paraffin-embedded tissues. 

2. A monoclonal antibody produced by the hybridoma cell line ATCC No. 
HB 11481, or a monoclonal antibody that binds to the same antigenic 
determinant as a monoclonal antibody produced by the hybridoma cell line 
ATCC No. HB 11481. 

3. A method of screening for agents that inhibit expression of the TCI
gene in vitro, comprising

exposing a metastatic cell line in which TCI mRNA is detectable in 
culture to an agent suspected of inhibiting production of said TCI mRNA; 
and 

determining the level of TCI mRNA in said exposed cell line, 
wherein a decrease in the level of TCI mRNA after exposure of said cell 
line to said agent is indicative of inhibition of said TCI mRNA 
production. 

4. A method of screening for agents that inhibit expression of the TCI
protein in vitro, comprising

exposing a metastatic cell line in which TCI protein is detectable 
in culture to an agent suspected of inhibiting production of said TCI 
protein; and 

determining the level of TCI protein in said exposed cell line, 
wherein a decrease in the level of TCI protein after exposure of said 
cell line to said agent is indicative of inhibition of said TCI protein 
production. 

5. The method of claim 3, said cell line comprising JMN1B. 

6. The method of claim 4, said cell line comprising JMN1B.

7. A method of screening for agents that inhibit expression of the TCI 
gene in vivo, comprising 

exposing a mammal having tumor cells in which TCI mRNA is
detectable to an agent suspected of inhibiting production of said TCI
mRNA; and

determining the level of TCI mRNA in tumor cells of said exposed
mammal, wherein a decrease in the level of TCI mRNA after exposure of
said mammal to said agent is indicative of inhibition of said TCI mRNA
production.



WO 95/11923 PCT/US94/12502 




8. A method of screening for agents that inhibit production of TCI
protein in vivo, comprising

exposing a mammal having tumor cells in which TCI protein is
detectable to an agent suspected of inhibiting production of said TCI
protein; and

determining the level of TCI protein in tumor cells of said exposed
mammal, wherein a decrease in the level of TCI protein after exposure of
said mammal to said agent is indicative of inhibition of said TCI protein
production.

9. The method of claim 7, wherein said tumor cells are breast tumor
cells, colon tumor cells, or tumor cells of the gastrointestinal tract.

10. The method of claim 8, wherein said tumor cells are breast tumor
cells, colon tumor cells, or tumor cells of the gastrointestinal tract.

11. A pharmaceutical composition for use in treating a late stage
cancer comprising an effective amount of an inhibitor of TCI.

12. The pharmaceutical composition of claim 11 wherein said late stage 
cancer is one of breast cancer, colon cancer, or gastrointestinal cancer. 

13. A pharmaceutical composition for use in preventing tumor cell 
metastasis comprising an effective amount of an inhibitor of TCI. 

14. A method for detecting a tumor in a subject, comprising detecting
the presence of tumor marker protein TCI in a sample of body fluid from
said subject.

15. A method for detecting a tumor in a subject comprising the steps of:
providing a sample of body fluid from said subject;

contacting said sample with a monoclonal antibody specific for an
epitope of tumor marker protein TCI; and

detecting the presence of TCI protein in said sample, wherein the
presence of TCI protein in said sample is indicative of the presence of
a tumor in said subject.

16. The method of claim 14 or 15, wherein said body fluid is selected
from the group consisting of blood, urine and sputum.



17. A method for detecting a tumor in a subject, comprising detecting
the presence of tumor marker protein TCI, or of mRNA encoding tumor
marker protein TCI, in a sample of a tissue section from said subject.

18. A method for detecting an invasive or metastatic tumor comprising
the steps of:

providing a sample of a formalin-fixed or paraffin-embedded tissue
section from said subject;

contacting said sample with a monoclonal antibody specific for an
epitope of tumor marker protein TCI in formalin-fixed or paraffin-
embedded tissue sections; and

determining the level of TCI protein in said sample, wherein the
level of TCI protein in said sample is related to the presence of an
invasive or metastatic tumor in said subject.

19. The method of claim 17 or 18, wherein said tissue is breast, colon, 
or gastrointestinal tract tissue. 

20. The method of claim 18, wherein said monoclonal antibody is a 
monoclonal antibody produced by the hybridoma cell line ATCC No. HB 
11481. 

21. The method of claim 18, wherein said monoclonal antibody is a 
monoclonal antibody that binds to the same antigenic determinant as a 
monoclonal antibody produced by the hybridoma cell line ATCC No. HB 
11481. 

22. A kit for diagnosis of an invasive or metastatic tumor in a 
subject, comprising 

the monoclonal antibody of claim 1 or claim 2.

AAA CAA GTT ATT GAG CTG GCT GGA AAA CAG CAA ACC ACC TTC ACG GAT CTT GTG GCC CAA
K Q V I E L A G K Q Q T T F T D L V A Q

TTA GGC TTG GCA TCT GCT CTG AGG CCA GAT GGA GAA TAC ACT TTG CTG GCA CCT GTG AAT